184.729 Semantic Technologies

2020W, VU, 2.0h, 3.0EC, to be held in blocked form
TUWEL

Properties

  • Semester hours: 2.0
  • Credits: 3.0
  • Type: VU Lecture and Exercise
  • Format: Online

Learning outcomes

After successful completion of the course, students are able to: 

(1) Assess how data from different domains can be viewed as a graph, compare graph-structured representations of data, and translate such representations to and from related data modeling formalisms like relational databases and RDF triple stores.

(2) Write, read, and understand specifications of domain knowledge for a given graph data set using RDF-S, SHACL, and different variants of the OWL profiles.

(3) Formulate information needs as queries over knowledge graphs using adequate query languages: conjunctive queries, unions of conjunctive queries, and their extensions with regular path expressions.

(4) Identify which query languages can express a given query, and transform queries between representations.

(5) Choose which of the mentioned specification formalisms is adequate for a given setting, taking into account the completeness of data, information needs, and expected computational efficiency of access.

(6) Describe an algorithm for validating a knowledge graph comprising knowledge in any of these languages, and for accessing the data. On example inputs of moderate size, students will be able to answer queries or validate targets manually.

(7) Compare the studied querying and validation algorithms, argue which techniques are adequate for which formalisms, and provide an explanation of their computational complexity.

(8) Given an inconsistent knowledge graph, list its repairs and compute the answers to a given query over standard variations of the repair semantics.

(9) Given a description of a domain and of some possibly heterogeneous incomplete data sources, write an OBDI specification to construct a virtual knowledge graph.

(10) Given a query and an OBDI specification, explain how to compute the answers over the represented virtual knowledge graph.

 

 

Subject of course

The course studies several semantic technologies and the way they can be used for integrating and accessing data, especially data that cannot be easily handled with legacy techniques because it may be incomplete, inconsistent, or heterogeneous, and expensive to integrate and maintain. We will study specification languages like RDF-S, SHACL, and the OWL profiles, as well as the SPARQL query language. These formalisms are studied in some detail, comparing their abstract syntax, their semantic assumptions, and their core algorithms for validation and query evaluation. We will see how RDF-S, SHACL, and OWL can be used to validate graph data and to obtain useful knowledge graphs from data that may be incomplete and heterogeneous. We will also study how these graphs can be queried in (fragments of) the SPARQL query language, some of the algorithmic and computational challenges that result from different choices of formalisms, and solutions for querying both virtual and inconsistent knowledge graphs.

Part 1:  Querying graphs 

  Topic 1: Data Model for Graph-structured Data (GSD)

1.1 The basic model of GSD

  • Definition of a GSD instance
  • Representations of GSD instances (as labeled graphs and as sets of atoms, conversions)
  • The GSD model as a special case of the relational model
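
As a rough illustration of the two representations listed above, the following Python sketch converts between a set of atoms and a labeled graph. It assumes that unary atoms act as node labels and binary atoms as labeled edges; the exact definition used in the lecture may differ, and the instance data and function names are hypothetical.

```python
# Hypothetical toy instance: unary atoms are (label, node),
# binary atoms are (label, source, target).
atoms = {
    ("Person", "ann"),
    ("Course", "st"),
    ("attends", "ann", "st"),
}

def atoms_to_graph(atoms):
    """Turn a set of atoms into a labeled graph (node_labels, edges)."""
    node_labels = {}
    edges = set()
    for atom in atoms:
        if len(atom) == 2:        # unary atom A(c): label node c with A
            label, node = atom
            node_labels.setdefault(node, set()).add(label)
        elif len(atom) == 3:      # binary atom r(c, d): r-labeled edge c -> d
            label, src, dst = atom
            edges.add((src, label, dst))
            node_labels.setdefault(src, set())   # endpoints are nodes too
            node_labels.setdefault(dst, set())
        else:
            raise ValueError(f"unsupported atom arity: {atom!r}")
    return node_labels, edges

def graph_to_atoms(node_labels, edges):
    """Inverse conversion: labeled graph back to a set of atoms."""
    result = {(label, node)
              for node, labels in node_labels.items()
              for label in labels}
    result |= {(label, src, dst) for src, label, dst in edges}
    return result

node_labels, edges = atoms_to_graph(atoms)
round_trip = graph_to_atoms(node_labels, edges)
```

Because every atom has arity at most two, the same data can also be read as a relational instance with one unary or binary relation per label.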

1.2 Property graphs 

  • Property Graphs: definition, property graphs as relational instances  

  Topic 2:  Querying graphs 

  • Definitions of CQs, UCQs, 2RPQs, C2RPQs
  • Comparison of query languages, expressiveness  
  • Syntactic variants for representing queries 

2.1 Query answering 

  • Query answers: definition, certain answers, decision problems 
  • Basics of query evaluation over GSD instances
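
To make the idea of query evaluation concrete, here is a minimal Python sketch that evaluates a conjunctive query by extending variable bindings one atom at a time. It is a naive illustration, not the algorithm presented in the lecture; the `?`-prefix for variables, the toy instance, and all function names are assumptions of this sketch. It needs Python 3.8 or later.

```python
# Hypothetical toy instance: unary atoms (label, node), binary atoms (label, src, dst).
instance = {
    ("Person", "ann"), ("Person", "bob"),
    ("Course", "st"),
    ("attends", "ann", "st"),
    ("teaches", "bob", "st"),
}

def is_var(term):
    """Variables are written with a leading '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")

def match(atom, fact, binding):
    """Extend `binding` so that `atom` maps onto `fact`; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or check it against its earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def evaluate_cq(answer_vars, body, instance):
    """Return the set of answer tuples of a CQ over the instance."""
    bindings = [{}]
    for atom in body:
        bindings = [ext
                    for b in bindings
                    for fact in instance
                    if (ext := match(atom, fact, b)) is not None]
    return {tuple(b[v] for v in answer_vars) for b in bindings}

# q(x, y) :- teaches(x, z), attends(y, z), Person(y)
answers = evaluate_cq(
    ["?x", "?y"],
    [("teaches", "?x", "?z"), ("attends", "?y", "?z"), ("Person", "?y")],
    instance,
)
```

For a fixed query, the loop runs in time polynomial in the size of the instance, while the number of intermediate bindings can grow exponentially with the number of query atoms.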

2.2 Complexity of query answering

  • Definition of data and combined complexity
  • Complexity bounds

  Topic 3:  Querying RDF graphs 

  • RDF triples and graphs
  • GSD vs RDF graphs, RDF ABoxes
  • BGPs and CQ-BGPs, syntactic representations
  • Evaluating CQ-BGPs over RDF ABoxes: completion rules
  • Combined and data complexity of evaluating CQ-BGPs over RDF ABoxes

 

Part 2:  Ontologies and knowledge graphs  

  Topic 4:  Querying with Knowledge 

  • RDF-S ontologies, abstract syntax, knowledge graphs (KGs)
  • Definition of models for KGs
  • Query evaluation with RDF-S ontologies, completion rules 
  • Complexity of query evaluation 
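
The completion-based approach can be sketched in Python as a fixpoint computation over triples. The rules below cover only a subset of the standard RDFS entailment rules (subclass, subproperty, domain, and range); the completion rules used in the lecture may be formulated differently, and the `ex:` names are hypothetical.

```python
SUBCLASS, SUBPROP = "rdfs:subClassOf", "rdfs:subPropertyOf"
DOMAIN, RANGE, TYPE = "rdfs:domain", "rdfs:range", "rdf:type"

def saturate(triples):
    """Apply a subset of the RDFS entailment rules until a fixpoint is reached."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))   # subclass transitivity
                if p == SUBPROP and p2 == SUBPROP and o == s2:
                    new.add((s, SUBPROP, o2))    # subproperty transitivity
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))       # instances move up the class hierarchy
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))          # triples move up the property hierarchy
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))       # subject is typed by the domain
                if p2 == RANGE and p == s2:
                    new.add((o, TYPE, o2))       # object is typed by the range
        if new <= closure:
            return closure
        closure |= new

# Hypothetical knowledge graph: ontology triples plus one data triple.
kg = {
    ("ex:teaches", DOMAIN, "ex:Lecturer"),
    ("ex:Lecturer", SUBCLASS, "ex:Person"),
    ("ex:bob", "ex:teaches", "ex:st"),
}
saturated = saturate(kg)

# Evaluating the query "x rdf:type ex:Person" over the saturated graph.
persons = {s for s, p, o in saturated if p == TYPE and o == "ex:Person"}
```

The query has no answer over the original triples; `ex:bob` becomes an answer only after the domain and subclass rules have been applied.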

4.1 Inconsistency

  • Disjointness assertions in RDF
  • Data consistency w.r.t. an ontology
  • Testing consistency: definition, algorithms, and complexity
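
A consistency test against disjointness assertions can be sketched as follows. The sketch assumes that the type assertions have already been completed w.r.t. the ontology and that disjointness is given as a set of class pairs; the representation, the data, and the function name are hypothetical and not taken from the course materials.

```python
def find_clashes(type_assertions, disjoint_pairs):
    """Return (node, class_a, class_b) for every node typed with two disjoint classes."""
    classes_of = {}
    for node, cls in type_assertions:
        classes_of.setdefault(node, set()).add(cls)
    clashes = set()
    for node, classes in classes_of.items():
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                clashes.add((node, a, b))
    return clashes

# Hypothetical data: "ann" is typed with two classes declared disjoint.
types = {("ann", "Student"), ("ann", "Course"), ("bob", "Lecturer")}
disjoint = {("Student", "Course")}

clashes = find_clashes(types, disjoint)
consistent = not clashes
```

The data is consistent exactly when no clash is found. The check itself is linear in the number of type assertions for a fixed set of disjointness pairs.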

  Topic 5: OWL Ontologies 

  • DL-like syntax for OWL QL, OWL EL, OWL 1 Lite, OWL 2 
  • Formal semantics of OWL: definition of interpretations, modelhood 
  • Reasoning tasks: class satisfiability, class subsumption
  • Writing OWL ontologies 
  • Comparing profiles 
  • Query answering in QL: algorithms and complexity
  • Query answering in EL: algorithms and complexity
  • Beyond the lightweight profiles: expressiveness and complexity 

 

Part 3: Variations

  Topic 6: SHACL 

  •  SHACL schemas, abstract syntax
  • SHACL validation  
  • Comparing OWL and SHACL

  Topic 7:  Ontology-based data integration and virtual KGs

  • OBDI and VKGs: goals and principles
  • OBDI specifications and OBDI instances, models
  • Certain answers: definition
  • Query evaluation pipeline, basic algorithms 

     

Teaching methods

The course has three components.

(1) *Lecture videos* (asynchronous)
Video presentations of the lecture materials will be posted online every week for asynchronous watching.

(2) *Online discussion* (attendance optional; synchronous, tentative schedule Tue 9:00-10:00)
In a weekly Zoom meeting, I will comment on the materials posted the week before, answer questions, and discuss exercises.
In some of these meetings there will be online quizzes (I will announce them a week before), which are optional as an alternative to the exam.

(3) *Exercises*
Optional exercises will be posted, to be solved independently. I will provide feedback on the solutions I receive. Correct solutions can also contribute to the mark, as an alternative to the exam.

 



 

Mode of examination

Written and oral

Additional information

ECTS breakdown: 3 ECTS = 75 hours

Lectures: 20 hours
Exercises and self-study:  53 hours
Exam: 2 hours

Lecturers

Institute

Course dates

Day  Time           Date                     Location                     Description
Tue  09:00 - 10:00  13.10.2020 - 26.01.2021  Zoom meeting (TUWEL) (LIVE)  Online discussion of lecture materials
Semantic Technologies - Single appointments
Day  Date        Time           Location              Description
Tue  13.10.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  20.10.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  27.10.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  03.11.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  10.11.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  17.11.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  24.11.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  01.12.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  15.12.2020  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  12.01.2021  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  19.01.2021  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Tue  26.01.2021  09:00 - 10:00  Zoom meeting (TUWEL)  Online discussion of lecture materials
Course is held blocked

Examination modalities

*Grades*
You can choose between two forms of grading: 
(A) You can get your grade during the semester, showing that you master the material of all the course units by means of exercises and quizzes. 
(B) An exam at the end of the semester (written + oral). 
In model (A) you can retake previous units to improve their grades.

More details will be announced in the introductory lecture on Tuesday, 13.10., 9:00 on Zoom.

Course registration

Begin             End               Deregistration end
19.08.2020 09:00  16.11.2020 23:00

Curricula

Study Code                     Obligation          Semester  Precon.  Info
066 645 Data Science           Not specified
066 926 Business Informatics   Mandatory elective
066 931 Logic and Computation  Mandatory elective

Literature

No lecture notes are available.

Accompanying courses

Language

English