It might be of interest to some that we have switched from trying to index everything in SPARQL to a mixed approach: everything appears on the frontpage in real time, but only selected websites (RDF, RDFa, microformats, microdata, etc.) plus selected LOD datasets appear in SPARQL, which is updated regularly though not in real time.
This solution allows us to offer a reasonable quality of service while fitting within our limited research resources (as this is a research project). By providing this service we intend to foster experimentation by the community, who can now be sure that their favorite dataset is loaded (just send us a request) and can be queried, e.g. in SPARQL, next to their favorite web of data website (just make sure it's in the list of those indexed, or send us a request). Some details of this mechanism (and the fact that it let us process 100M RDF docs in a day) are in this blog post. A UI making all of this clearer is coming in August. Thanks must go to OpenLink for the support provided in setting this mechanism up, and to the others mentioned in the blog post.
Gio