Universität Tübingen  Fakultät > WilhelmSchickardInstitut > Algorithmik > Forschung > Abgeschlossene Arbeiten > Analysis of hypertext structures 

DFGProject: WWW Analysis and VisualizationA research project of the Schwerpunkt Programm Nr. 1126: Algorithmik grosser und komplexer Netzwerke Sitemap Generation  Clustering of the Web Graph  People Sitemap GenerationAs the work of a thesis, a prototype for the automated generation of sitemaps was implemented. The system supports structure and usage analysis of an site for its sitemap generation. More detailed information is available at the project's page. Clustering of the Web GraphThe web graph is the graph you get, if you take each web page as node and each hyperlink as edge of the graph. This link structure, as being manually generated by humans, carries a lot of information about the topic relationship between the web pages. As the web graph is very large, with billions of nodes and edges, an mechanism must be found to extract information from such a huge amount of data. An obvious approach now is to sort the pages by topic, like the way i.e. yahoo does. To do this, large sub graphs of the web graph need to be retrieved, that are very strong related. This process is usually called clustering. Unfortunately all clustering problems of interest are NPcomplete. Therefore heuristics and approximation algorithms need to be used. The goal of this part of the project is, to first find a good clustering problem for extracting topic relationships and to efficiently compute it on the web graph. If possible, special properties of the web graph (spareness, power law distributions) should be used. People

Anregungen / Kritik Impressum 