You can store the field values and then run different analyzer chains over
them (the fields only need to be stored, not indexed).

I'd probably use the collector pattern:

    // Lucene 4.x API, imports omitted. 'searcher', 'analyzer' and
    // 'fieldsToLoad' (a Set<String>) are assumed to exist in your code.
    searcher.search(new MatchAllDocsQuery(), new Collector() {
      private AtomicReader reader;

      public boolean acceptsDocsOutOfOrder() {
        return true;
      }

      public void collect(int i) {
        Document d;
        try {
          // load only the stored fields we want to re-analyze
          d = reader.document(i, fieldsToLoad);
          for (String f: fieldsToLoad) {
            String[] vals = d.getValues(f);
            for (String s: vals) {
              // the first argument of tokenStream() is a field name
              TokenStream ts = analyzer.tokenStream(targetAnalyzer, new StringReader(s));
              ts.reset(); // required before the first incrementToken()
              while (ts.incrementToken()) {
                //do something with the analyzed tokens
              }
              ts.end();
              ts.close();
            }
          }
        } catch (IOException e) {
          // pass
        }
      }

      public void setNextReader(AtomicReaderContext context) {
        // called once per index segment
        this.reader = context.reader();
      }

      public void setScorer(Scorer scorer) {
        // Do Nothing
      }
    });

    // or persist the data here if one of your components knows to write
    // to disk, but there is no api...
    TokenStream ts = analyzer.tokenStream(data.targetField, new
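
For the "write the tokens to a file" part, here is a rough, Lucene-free sketch
of what I mean. Everything in it is made up for illustration: the class name
TokenFileDump, the method writeTokens(), and especially analyze(), which is
only a stand-in for a real analyzer chain (it splits on non-alphanumeric
characters and lowercases). In the collector above you would iterate the real
TokenStream instead of calling analyze().

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class TokenFileDump {

    // Hypothetical stand-in for an analyzer chain (NOT a Lucene API):
    // split on runs of non-letter/non-digit characters, then lowercase.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Analyze each stored value and write its tokens to the file, one token
    // per line. An existing file is overwritten. Returns the token count.
    static int writeTokens(List<String> storedValues, Path out) throws IOException {
        int count = 0;
        try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (String value : storedValues) {
                for (String token : analyze(value)) {
                    w.write(token);
                    w.newLine();
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("tokens", ".txt");
        int n = writeTokens(Arrays.asList("Storing tokens in a file", "Lucene and Solr"), out);
        System.out.println(n + " tokens written to " + out);
    }
}
```

Nothing gets indexed here; the stored text goes in and a plain token file
comes out, which you can load into a database later.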


On Mon, May 27, 2013 at 9:37 AM, Furkan KAMACI wrote:

> Hi;
> I want to use Solr for academic research. As one step, I want to
> store tokens in a file (I will store them in a database later), and I
> don't want to index them. For this kind of purpose, should I use core
> Lucene or Solr? Is there an example of writing a custom analyzer and just
> storing tokens in a file?
