You're right -- changing anything about the input (snippet length, number of documents, etc.) will alter the clusters. That is simply how clustering works. If you want clustering in your search engine then, depending on the type of data you serve, you'll have to experiment with the settings a bit and see which ones give you satisfactory results. I don't think there is any particular reason to provide different data to the clusterer. Moreover, it would complicate things considerably.
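To make the input sensitivity concrete, here is a minimal sketch in Python. It is a toy grouping by shared words, not the algorithm of any real clustering engine, and the names `toy_clusters` and `truncate` as well as the sample snippets are made up for illustration. It only shows that shortening snippets or dropping a document changes the groups you get back.

```python
from collections import defaultdict


def toy_clusters(snippets, min_docs=2):
    """Toy grouping (hypothetical): one group per word shared by >= min_docs snippets."""
    by_word = defaultdict(set)
    for i, text in enumerate(snippets):
        # Use a set so a word repeated inside one snippet is counted once.
        for word in set(text.lower().split()):
            by_word[word].add(i)
    return {w: sorted(ids) for w, ids in by_word.items() if len(ids) >= min_docs}


def truncate(snippets, max_words):
    """Keep only the first max_words words of each snippet."""
    return [" ".join(s.split()[:max_words]) for s in snippets]


# Made-up sample input, for illustration only.
docs = [
    "apache lucene search library for java",
    "apache solr search server built on lucene",
    "carrot2 clustering engine for search results",
]

full = toy_clusters(docs)                # all documents, full snippets
short = toy_clusters(truncate(docs, 2))  # same documents, shorter snippets
fewer = toy_clusters(docs[:2])           # fewer documents, full snippets

print(full)
print(short)
print(fewer)
```

The three printed results differ from each other even though the underlying documents are the same, which is why the settings have to be tuned against the data you actually serve.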
Thanks, Dawid, for your response. In fact, I don't really want to change this; I just wanted to be sure that everybody is aware of it and to hear some opinions. Regards, Jérôme -- http://motrech.free.fr/ http://www.frutch.org/