JAVA/ DOT NET PROJECT ABSTRACT 2016-2017
QUALITY-AWARE SUBGRAPH MATCHING OVER INCONSISTENT PROBABILISTIC GRAPH DATABASES
The Resource Description Framework (RDF) is a general framework for how to describe any Internet resource such as a Web site and its content. An RDF description (such descriptions are often referred to as metadata, or "data about data") can include the authors of the resource, date of creation or updating, the organization of the pages on a site (the sitemap), information that describes content in terms of audience or content rating, key words for search engine data collection, subject categories, and so forth.Resource Description Framework has been widely used in the Semantic Web to describe resources and their relationships. The RDF graph is one of the most commonly used representations for RDF data. However, in many real applications such as the data extraction/integration, RDF graphs integrated from different data sources may often contain uncertain and inconsistent information (e.g., uncertain labels or that violate facts/rules), due to the unreliability of data sources. In this paper, we formalize the RDF data by inconsistent probabilistic RDF graphs, which contain both inconsistencies and uncertainty. With such a probabilistic graph model, we focus on an important problem, quality-aware subgraph matching over inconsistent probabilistic RDF graphs , which retrieves subgraphs from inconsistent probabilistic RDF graphs that are isomorphic to a given query graph and with high quality scores (considering both consistency and uncertainty). In order to efficiently answer QA-gMatch queries, we provide two effective pruning methods, namely adaptive label pruning and quality score pruning, which can greatly filter out false alarms of subgraphs. We also design an effective index to facilitate our proposed pruning methods, and propose an efficient approach for processing QA-gMatch queries. Finally, we demonstrate the efficiency and effectiveness of our proposed approaches through extensive experiments.
Probabilistic graphs are often obtained from real-world applications such as the data extraction/integration in the SemanticWeb. Due to the unreliability of data sources or inaccurate extraction/integration techniques, probabilistic graph data often contain inconsistencies, violating some rules or facts. Here, rules or facts can be specified by knowledge base or inferred by data mining techniques. RDF graphs integrated from different data sources may often contain uncertain and inconsistent information , due to the unreliability of data sources. we formalize the RDF data by inconsistent probabilistic RDF graphs, which contain both inconsistencies and uncertainty. With such a probabilistic graph model, we focus on an important problem, quality-aware subgraph matching over inconsistent probabilistic RDF graphs (QA-gMatch), which retrieves subgraphs from inconsistent probabilistic RDF graphs that are isomorphic to a given query graph and with high quality scores (considering both consistency and uncertainty.
In this paper, we propose the quality-aware subgraph matching problem (namely, QA-gMatch) in a novel context of inconsistent probabilistic graphs G with quality guarantees. Specifically, given a query graph q, a QA-gMatch query retrieves subgraphs g of probabilistic graph G that match with q and have high quality scores. The QA-gMatch problem has many practical applications such as the Semantic Web. For example, we can answer standard queries, SPARQL queries, over inconsistent probabilistic RDF graphs by issuing QA-gMatch queries. we will propose effective pruning methods, namely adaptive label pruning (based on a cost model) and quality score pruning, to reduce the QAgMatch search space and improve the query efficiency.
• We propose the QA-gMatch problem in inconsistent probabilistic graphs, which, to our best knowledge, no prior work has studied.
• We carefully design effective pruning methods, adaptive label and quality score pruning, specific for inconsistent and probabilistic features of RDF graphs.
• We build a tree index over pre-computed data of inconsistent probabilistic graphs, and illustrate efficient QA-gMatch query procedure by traversing the index.
H/W System Configuration:-
Processor - Pentium –III
Speed - 1.1 Ghz
RAM - 256 MB(min)
Hard Disk - 20 GB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:-
Operating System :Windows95/98/2000/XP
Application Server : Tomcat5.0/6.X
Front End : HTML, Java, Jsp
Server side Script : Java Server Pages.
Database Connectivity : Mysql.