The OpenKnowledge project aims at knowledge sharing through open and flexible peer interactions. Within this project, we are developing a system that supports searching for, developing and sharing interactions (workflows) composed of roles, which are implemented by software that peers can share and execute. Its main requirements are openness, scalability, decentralization and robustness. Part of this system is a discovery service, which is the focus of this work. The service aims to meet these requirements by combining a Peer-to-Peer architecture with Distributed Hash Tables (DHTs), achieving robustness through redundancy and scalability through decentralization. Resources are discovered using a set of attribute-value pairs. A straightforward DHT-based approach that builds a distributed inverted index suffers from a number of messages and replicas that grows linearly with the number of attributes. We aim to reduce this cost by proposing an efficient multi-attribute routing algorithm.
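To make the linear cost concrete, the following is a minimal sketch of the straightforward inverted-index approach, not the OpenKnowledge implementation. It assumes an in-memory stand-in for a DHT; the names `ToyDHT`, `publish` and `search`, the `attribute=value` key format, and the message counter are all hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for a DHT (an assumption, not a real overlay).

    Each key is hashed onto one of a fixed set of nodes, and every
    put/get is counted as one message.
    """

    def __init__(self, num_nodes=8):
        self.nodes = [defaultdict(set) for _ in range(num_nodes)]
        self.messages = 0

    def _node_for(self, key):
        # Map the key to a node by hashing, as a DHT would.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def put(self, key, value):
        self.messages += 1
        self._node_for(key)[key].add(value)

    def get(self, key):
        self.messages += 1
        return set(self._node_for(key).get(key, set()))


def publish(dht, resource_id, description):
    """Index a resource under every attribute-value pair: one replica each."""
    for attr, val in description.items():
        dht.put(f"{attr}={val}", resource_id)


def search(dht, query):
    """Look up every attribute-value pair and intersect the posting sets."""
    results = None
    for attr, val in query.items():
        hits = dht.get(f"{attr}={val}")
        results = hits if results is None else results & hits
    return results or set()
```

In this sketch, publishing a description with k attribute-value pairs costs k messages and creates k replicas of the resource identifier, and a query over k pairs costs k lookups. This is the linear growth that a multi-attribute routing algorithm tries to avoid.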
As a solution to the problem of managing large descriptions and multi-attribute search, this work focuses on popularity-based approaches. The key idea is that popular content is easily available on the network because it is highly replicated. Therefore, unlike rare items, it does not require much indexing effort. It is intuitively true, and experimentally verified, that very popular items need no sophisticated routing: even a flooding approach suffices.
Our approach is to use statistical information, calculated automatically and in a distributed way, to determine on the fly which terms are rare and which queries refer to them, and to adapt the routing process accordingly.
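The routing decision described here can be sketched as follows. This is an illustrative reading under stated assumptions, not the algorithm proposed in this work: the statistics are approximated by term frequencies over a local sample of descriptions, the cutoff `RARE_THRESHOLD` is a hypothetical value, and the function names `term_frequencies` and `choose_route` are invented for the example.

```python
from collections import Counter

# Hypothetical cutoff: terms seen in fewer than 10% of sampled
# descriptions are treated as rare.
RARE_THRESHOLD = 0.1


def term_frequencies(observed_descriptions):
    """Estimate, from a local sample, the fraction of descriptions containing each term."""
    counts = Counter()
    for description in observed_descriptions:
        counts.update(set(description))  # count each term once per description
    total = len(observed_descriptions)
    return {term: n / total for term, n in counts.items()} if total else {}


def choose_route(query_terms, freqs, threshold=RARE_THRESHOLD):
    """Pick a routing strategy for a query.

    Returns ("flood", None) when every term is popular, otherwise
    ("index", term) where term is the rarest term in the query.
    Unseen terms have frequency 0.0 and therefore count as rare.
    """
    rare = [t for t in query_terms if freqs.get(t, 0.0) < threshold]
    if not rare:
        return ("flood", None)
    # Ties are broken alphabetically so the choice is deterministic.
    rarest = min(rare, key=lambda t: (freqs.get(t, 0.0), t))
    return ("index", rarest)
```

For example, if most sampled descriptions contain `role=buyer` and only a few contain `domain=orchid-trade`, a query with both terms would be routed through the index entry of the rare term, while a query consisting only of popular terms would fall back to flooding.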