A large number of published datasets (or sources) that follow the Linked Data principles is currently available, and this number grows rapidly. However, the major target of Linked Data, i.e., linking and integration, is not easy to achieve. In general, information integration is difficult because (a) datasets are produced, kept, or managed by different organizations using different models, schemas, or formats, (b) the same real-world entities or relationships are referred to with different URIs or names and in different natural languages, (c) datasets usually contain complementary information, (d) datasets can contain data that are erroneous, out-of-date, or conflicting, (e) datasets, even about the same domain, may follow different conceptualizations of the domain, and (f) everything can change (e.g., schemas, data) as time passes. This article surveys the work that has been done in the area of Linked Data integration: it identifies the main actors and use cases, it analyzes and factorizes the integration process according to various dimensions, and it discusses the methods used in each step. Emphasis is given to methods that can be used for integrating several datasets. Based on this analysis, the article concludes with directions that are worth further research.

Link traversal has emerged as a SPARQL query processing method that exploits the Linked Data principles to dynamically discover data relevant for answering a query by dereferencing online Web resources (URIs) at query execution time. While several approaches for such a lookup-based query evaluation method have been proposed, there exists no analysis of the types (patterns) of queries that can be directly answered on the Web of Data through a "zero-knowledge" approach, i.e., without accessing local or remote endpoints and without a priori knowledge of the available data sources. In this paper, we first provide a method for examining whether a SPARQL query can be answered through zero-knowledge link traversal, and we analyse a large corpus of real SPARQL query logs to find the frequency and distribution of answerable and non-answerable query patterns. Subsequently, we provide an algorithm for transforming answerable queries to SPARQL-LD queries that bypass the endpoints, as well as a method to estimate their evaluation cost, which can be useful for deciding on the query execution strategy to follow. We report experimental results on the efficiency of the transformed queries and discuss the benefits and limitations of this query evaluation method.

The Web of Linked Data is a huge graph of distributed and interlinked data sources fueled by structured information. This new environment calls for formal languages and tools to automatize navigation across data sources (nodes in such a graph) and to enable semantic-aware and Web-scale search mechanisms. In this article we introduce NAUTILOD, a declarative navigational language for the Web of Linked Data graph. NAUTILOD enables one to specify data sources via the intertwining of navigation and querying capabilities. It also features a mechanism to specify actions (e.g., sending notification messages) that obtain their parameters from data sources reached during the navigation. We provide a formalization of the NAUTILOD semantics, which captures both nodes and fragments of the Web of Linked Data. We present algorithms to implement this semantics and study their computational complexity. We discuss an implementation of the features of NAUTILOD in a tool called swget, which exploits current Web technologies and protocols. We report on the evaluation of swget and its comparison with related work. Finally, we show the usefulness of capturing Web fragments by providing examples from different knowledge domains.
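As a rough, illustrative sketch of the "zero-knowledge" answerability idea (not the actual algorithm from the paper above): a basic graph pattern can only be evaluated by pure link traversal if every triple pattern is reachable from a dereferenceable URI through shared (join) variables, since dereferencing those URIs is the only way to discover data without consulting an endpoint. A minimal Python check under this simplified assumption:

```python
def answerable_by_zero_knowledge_traversal(bgp):
    """Simplified answerability check for a basic graph pattern (bgp).

    bgp is a list of (subject, predicate, object) strings; terms starting
    with '?' are variables, all other terms are treated as URIs for
    simplicity (literals are ignored in this sketch). A triple pattern is
    considered evaluable if its subject or object is a URI (a document we
    can dereference) or a variable already bound by an evaluable pattern.
    Predicate URIs do not count as seeds, since dereferencing a predicate
    alone does not yield the triples that use it.
    """
    is_var = lambda term: term.startswith("?")
    bound_vars = set()
    pending = list(bgp)
    progress = True
    while progress and pending:
        progress = False
        for tp in pending[:]:
            s, p, o = tp
            if (not is_var(s)) or (not is_var(o)) \
                    or s in bound_vars or o in bound_vars:
                # Pattern is evaluable: its variables become bound,
                # providing new URIs to follow for the remaining patterns.
                bound_vars.update(t for t in (s, p, o) if is_var(t))
                pending.remove(tp)
                progress = True
    # Answerable only if every triple pattern was reached from a seed URI.
    return not pending
```

For example, `{ <http://example.org/alice> foaf:knows ?x . ?x foaf:name ?n }` is answerable under this criterion (dereference Alice's URI, then the URIs bound to `?x`), whereas `{ ?x foaf:name ?n }` alone is not, because there is no URI to start traversing from.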