Stormrose Proposal:  GnutellaNG proposal

Goal

Here's my take on the goal of Gnet: "To provide a distributed, ad hoc service location network optimised for the sharing of files that is bandwidth efficient and uses simple to program protocols."

Breakdown

I've seen many proposals and ideas for what GnutellaNG should be - here's a framework that I think can be used to see where these various proposals sit in relation to each other.  The protocol should be split into three groups, each with a different goal.  These are:
  1. Network Connectivity - connecting nodes to gnet.
  2. Service Location - searches and search replies
  3. Service Utilisation - server to client proprietary protocols

Network Connectivity

This section of the protocol is concerned with servents connecting to gnet.  It deals with the handshake process that new connections go through to connect to a servent.  Currently this group includes: Handshake, PING and PONG.  However I believe that PING and PONG should be removed altogether and replaced (more on this later).

The handshake can be used by a servent to refuse incoming connections.  It could reply something like: "Sorry I'm busy - but try these guys" followed by a list of IP/ports to try.  This busy response should at least include all the servents that this servent is currently connected to - however it can also include other servents from its host catcher.
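
As a rough sketch of how such a busy response might be put together (Python; the "BUSY" keyword, the one-host-per-line layout and the names build_busy_response and max_hosts are all made up here for illustration, nothing in the current protocol defines them):

```python
# Hypothetical sketch: the message layout and all names are assumptions,
# not part of any agreed gnet handshake.
def build_busy_response(connected, host_catcher, max_hosts=10):
    """Return a refusal message listing other servents to try."""
    # Always include every servent we are currently connected to.
    hosts = list(connected)
    # Top up with extra hosts from the host catcher, skipping duplicates.
    for host in host_catcher:
        if len(hosts) >= max(max_hosts, len(connected)):
            break
        if host not in hosts:
            hosts.append(host)
    lines = ["BUSY"]
    lines += [f"{ip}:{port}" for ip, port in hosts]
    return "\n".join(lines) + "\n"


connected = [("10.0.0.1", 6346), ("10.0.0.2", 6346)]
catcher = [("10.0.0.2", 6346), ("10.0.0.3", 6347)]
response = build_busy_response(connected, catcher)
```

The refused client can then feed the returned list straight into its own host catcher and try the next servent.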

The handshake should also negotiate just what the servents are willing to do for each other.  For example: the handshake is the perfect time for a modem user to request that the servent it is connecting to proxy/cache all searches on its behalf.

The handshake process also needs to be extensible so that people can add their own options without confusing other servents too much.  For instance an "elite group" might create a servent that provides priority services to connections that authenticate to it.  The authentication request should be part of the handshake, and an incoming connection that doesn't understand a request should simply say so in its handshake reply.  The point is that neither side should crash or behave erratically if it receives a handshake option it doesn't understand.
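
One way to get this "unknown options are harmless" behaviour is sketched below (Python; the option names "proxy-searches" and "elite-auth" and the function answer_handshake are invented for the example, and a real handshake would need an agreed wire format):

```python
# Hypothetical sketch: option names and the handler table are assumptions.
def answer_handshake(requested, supported):
    """Split requested options into accepted ones and ones we don't understand."""
    accepted = {}
    not_understood = []
    for name, value in requested.items():
        handler = supported.get(name)
        if handler is None:
            # Unknown option: report it back instead of failing.
            not_understood.append(name)
        else:
            accepted[name] = handler(value)
    return accepted, not_understood


# This servent only knows how to proxy searches for slow clients.
supported = {"proxy-searches": lambda value: value == "yes"}
requested = {"proxy-searches": "yes", "elite-auth": "secret-token"}
accepted, not_understood = answer_handshake(requested, supported)
```

The connection still goes ahead with the options both sides understand; the unknown one is simply reported back.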

Service Location

The goal of this protocol group is to locate services and return services that have been located already.  Currently this group includes Search Request and Search Reply.  I think that the search request needs to become more generic so that it can locate more services than the single "download file" service that the current gnet has.

For example the current gnet search query is a question that asks: "Which servents have files that match pattern X?"  The fact that you are actually looking for an HTTP GET download service is simply implied in the protocol.  This makes the real question: "Which servent can supply me an HTTP GET download service for files that match pattern X?".

Add other services to gnet and very soon you can ask questions like: "Which servents that run KamikazeIM have MyFriend currently Online?", or more simply: "Which servents are running a KamikazeIM server?".  The number of services can be extended very easily.

Servents should forward search requests and replies for services that they don't understand - but I think for sanity's sake there should be an agreed upon packet size limit for search requests.  Any search request bigger than this limit should be dropped.

This all means that a "ServiceType" should be added to current search requests.  This could either be a number or a text string - either one would work just fine.  The search string should be cut into parameters too (param1=value1, param2=value2).
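
To make this concrete, here is a minimal sketch of accepting such a request (Python).  The 256 byte limit, the field names and the functions parse_params and accept_query are assumptions of mine; the proposal deliberately leaves the real limit and encoding open.  The naive comma split also means values can't contain commas.

```python
# Hypothetical sketch: the limit and all names are assumptions.
MAX_QUERY_BYTES = 256  # assumed value; the agreed limit is still to be decided


def parse_params(search_string):
    """Cut 'param1=value1, param2=value2' into a dictionary."""
    params = {}
    for part in search_string.split(","):
        key, sep, value = part.strip().partition("=")
        if not sep or not key.strip():
            continue  # skip empty or malformed pieces
        params[key.strip()] = value.strip().strip('"')
    return params


def accept_query(service_type, search_string):
    """Return the parsed request, or None if it is over the size limit."""
    size = len(service_type.encode("utf-8")) + len(search_string.encode("utf-8"))
    if size > MAX_QUERY_BYTES:
        return None  # oversized requests are dropped, not forwarded
    return {"service_type": service_type, "params": parse_params(search_string)}


query = accept_query("KamikazeIM", 'handle="MyFriend", status="Online"')
too_big = accept_query("KamikazeIM", "handle=" + "x" * 300)
```

A servent that doesn't know the KamikazeIM service can still do the size check and forward the request untouched.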

Service Utilisation

Once you've located a service that you need then you communicate with it.  This communication actually happens outside of gnet!  Currently this includes the HTTP GET download of a file you're looking for but in the future - when gnet becomes a service location network - this will be any protocol that you use to talk to the service that you've found.

The exception to this rule is a Service Specific Msg: some alternative method for a client to tell the service that it cannot connect.  This is currently the PUSH request - but this too should be generalised for use by other services if they need it.  I don't actually see very many services needing to use this anyway.  Again this should have an agreed packet size limit so that it isn't used as a general comms channel.
 

Tell me more about service location

Services can be implemented as plugins to a client.  All servents will route all legal service search requests, even if they have no idea what the service is.  Client software could even use gnet to locate servers applicable to it.  Imagine ICQ, Napster, Freenet or mIRC using gnet to find a server to connect to.  Imagine <insert your server project here> being able to find others to talk to by looking for them over gnet.

Service Location also presents numerous opportunities for caching and efficient clustering on gnet.  If a servent is passing on a lot of search requests for a particular service then it can snoop the search replies for servents running that service and connect directly to one of them, reducing the hops needed for its clients to reach that servent.  Servents could also route service search requests only out of connections where they know a servent for that service will answer.
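
The selective routing idea could look something like this (a Python sketch only; the ServiceRouter class and its method names are invented here, and falling back to flooding when nothing is known yet is my assumption):

```python
from collections import defaultdict


# Hypothetical sketch: class and method names are assumptions.
class ServiceRouter:
    def __init__(self, connections):
        self.connections = list(connections)
        # service type -> connections that replies for it have arrived on
        self.answered_by = defaultdict(set)

    def snoop_reply(self, service_type, connection):
        """Remember that a reply for this service came back via this connection."""
        self.answered_by[service_type].add(connection)

    def routes_for(self, service_type, came_from):
        """Pick the connections a search request should be forwarded on."""
        known = self.answered_by.get(service_type)
        # Nothing learned yet: fall back to forwarding everywhere.
        candidates = known if known else self.connections
        return [c for c in self.connections if c in candidates and c != came_from]


router = ServiceRouter(["a", "b", "c"])
before = router.routes_for("KamikazeIM", came_from="a")
router.snoop_reply("KamikazeIM", "c")
after = router.routes_for("KamikazeIM", came_from="a")
```

Before any reply has been seen the request goes out on every other connection; afterwards it only goes where an answer is known to come from.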

Services should return the most relevant simple information in their replies, remembering that replies must be small too!  For instance, to the question "Which list services can tell me an email address for Joe Bloggs?" a servent might reply: "I'm a list servent running at x.x.x.x:y and Joe Bloggs' email address is jb@x.com".  For more information the client should connect directly to that List service using service specific protocols.
 

What was that about PING/PONGS?

Pings and Pongs must die!  They are pretty much unnecessary spam.  They serve only to fill host catchers with enough other hosts to connect to.  This can be done another way.  What about making the PING/PONG replacement just a standard service query, e.g. "Which servents supply the gnet service?"  Send this out with a low TTL and wait for the replies.
What are the advantages of doing it this way?  Well firstly it means fewer primitives in the gnet protocol; secondly servents can use the same caching/proxy logic they use for any other service to handle this request.  This saves some duplication of code.
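
Sketched in Python, the replacement might be no more than this (the service name "gnet", the default TTL of 2, the dictionary layout and both function names are assumptions for illustration):

```python
# Hypothetical sketch: field names and the default TTL are assumptions.
def make_host_discovery_query(ttl=2):
    """Build an ordinary service query asking who supplies the gnet service."""
    return {"service_type": "gnet", "params": {}, "ttl": ttl}


def harvest_hosts(host_catcher, replies):
    """Add the answering servents to the host catcher, skipping duplicates."""
    for reply in replies:
        host = (reply["ip"], reply["port"])
        if host not in host_catcher:
            host_catcher.append(host)
    return host_catcher


query = make_host_discovery_query()
replies = [
    {"ip": "10.0.0.5", "port": 6346},
    {"ip": "10.0.0.6", "port": 6346},
    {"ip": "10.0.0.5", "port": 6346},  # same servent answering twice
]
host_catcher = harvest_hosts([("10.0.0.1", 6346)], replies)
```

Because this is just another service query, any caching or proxying a servent already does for search replies applies to it as well.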
 

This sounds great, now give me some examples

Here's a search for files to download that include the words dr dre.  This is pretty much equivalent to what the current gnet protocol does.
Query Service:
Gnutella Classic HTTP Get
Params:
pattern="dr dre"
Answer Host Info:
ip=x.x.x.x
port=yy
List:
name="dr dre-interview1.mp3" size="234112" etc.
name="dr dre-interview2.mp3" size="398122" etc.
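
On the answering side, a servent offering this service might fill in the List part roughly as follows (Python sketch; the "every word must appear in the name" rule and the function name answer_file_query are my assumptions, since actual matching is left to each servent):

```python
# Hypothetical sketch: the matching rule and names are assumptions.
def answer_file_query(pattern, shared_files):
    """Return the shared files whose names contain every word of the pattern."""
    words = pattern.lower().split()
    return [
        entry for entry in shared_files
        if all(word in entry["name"].lower() for word in words)
    ]


shared_files = [
    {"name": "dr dre-interview1.mp3", "size": 234112},
    {"name": "dr dre-interview2.mp3", "size": 398122},
    {"name": "holiday-photos.zip", "size": 1048576},
]
matches = answer_file_query("dr dre", shared_files)
```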

Here's a search for KamikazeIM servers.  The answer is a prime candidate to cache.
Query Service:
KamikazeIM
Params:
Answer Host Info:
ip=x.x.x.x
port=yy
List:

Here's a search for KamikazeIM servers that have "MyFriend".
Query Service:
KamikazeIM
Params:
handle="MyFriend"
Answer Host Info:
ip=x.x.x.x
port=yy
List: 
handle="MyFriend"
email="mf@x.x"
status="Online"

Here's a search for KamikazeIM servers that have "MyFriend" currently Online.
Query Service:
KamikazeIM
Params:
handle="MyFriend"
status="Online"
Answer Host Info:
ip=x.x.x.x
port=yy
List: 
handle="MyFriend"
email="mf@x.x"
status="Online"

Here's a search for an email address from a List service.  It asks for the email address of a J Bloggs and gets back three answers.
Query Service:
List
Params:
format="XML"
listname="Email addresses"
surname="Bloggs"
firstname="J*"
Answer Host Info:
ip=x.x.x.x
port=yy
List: 
emailaddress="jb@x.org"
firstname="Jack"
surname="Bloggs"

emailaddress="joeb@x.org"
firstname="Joe"
surname="Bloggs"

emailaddress="joblg@x.org"
firstname="Joanne"
surname="Bloggs"


 

Comments

Comments?  Tell me what you think of this idea.  Does it stink, or does it sound logical?  Email me:
Stormrose [et@whanganui.ac.nz]