You can call the same API that the admin UI uses. Pass it strings and it
returns the tokens as JSON, XML, or whatever format you ask for.
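
A minimal sketch of what that could look like for many fields, assuming the
field analysis handler is registered at /analysis/field (as in the stock
configurations) and accepts the analysis.fieldname and analysis.fieldvalue
parameters. The host, port, and core name "mycore" are placeholders, not
values from this thread. The script only builds the request URLs, one per
field; fetching and parsing the responses is left out.

```python
from urllib.parse import urlencode

# Assumed values for illustration only: adjust host, port and core name.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "mycore"  # hypothetical core name


def analysis_url(field_name, field_value, base=SOLR_BASE, core=CORE):
    """Build a field-analysis request URL for one field and one input string."""
    params = {
        "analysis.fieldname": field_name,
        "analysis.fieldvalue": field_value,
        "wt": "json",
    }
    return f"{base}/{core}/analysis/field?{urlencode(params)}"


def analysis_urls(doc):
    """Return one URL per field of a document given as {field name: raw value}."""
    return [analysis_url(name, value) for name, value in doc.items()]


if __name__ == "__main__":
    # Field names taken from the example at the start of this thread.
    doc = {"name": "green-apple", "compressedname": "green-apple"}
    for url in analysis_urls(doc):
        # Fetch each URL (for example with urllib.request.urlopen) against a live Solr.
        print(url)
```

Looping over the fields of one source document this way gives a per-document
view of the analysis without clicking through the UI once per field.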

Upayavira

On Tue, Jun 30, 2015, at 06:55 PM, Dinesh Naik wrote:
> Hi Alessandro,
> 
> Let's say I have 20M documents with 50 fields in each.
> 
> I have applied text analysis such as compression, ngram, and synonym
> expansion to these fields.
> 
> Checking the analysis of an individual field is easy with
> admin/analysis, but I would need to repeat that check 50 times to
> cover all 50 fields.
> 
> I wanted to know whether Solr provides a way to see all of these
> analyzed fields at once (for example, by using the unique id).
> 
> Best Regards,
> Dinesh Naik
> 
> -----Original Message-----
> From: "Alessandro Benedetti" <benedetti.ale...@gmail.com>
> Sent: 30-06-2015 21:43
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke?
> 
> But what do you mean by the complete document? Is it no longer
> available?
> So you have lost your original document and want to try to
> reconstruct it from the index?
> 
> 2015-06-30 16:05 GMT+01:00 dinesh naik <dineshkumarn...@gmail.com>:
> 
> > Hi Alessandro,
> > > I am able to check the field-wise analysis results.
> > >
> > > I was interested in getting the complete document.
> > >
> > > As Erick mentioned -
> > > Reconstructing the doc from the postings lists is actually quite
> > > tedious. The Luke program (not the request handler) has a function
> > > that does this. It's not fast though, more for troubleshooting than
> > > for anything in a production environment.
> > >
> > > I'll try the Luke program to see if I can get this done.
> >
> > Thanks and Best Regards,
> > Dinesh Naik
> >
> > On Tue, Jun 30, 2015 at 7:42 PM, Alessandro Benedetti <
> > benedetti.ale...@gmail.com> wrote:
> >
> > > > Do you have the original document available? Or is it stored in the
> > > > field of interest?
> > > > It should be quite easy to reproduce the analysis using the
> > > > analysis tool that Upaya and Erick suggested.
> > > > Just use your real document content and you will see exactly how it
> > > > is analysed.
> > >
> > > Cheers
> > >
> > > 2015-06-30 15:03 GMT+01:00 dinesh naik <dineshkumarn...@gmail.com>:
> > >
> > > > Hi Erick,
> > > >
> > > > I agree with you.
> > > >
> > > > > But I was checking whether we could get hold of the whole document
> > > > > (to see all analyzed field values).
> > > > >
> > > > > It is possible that a field value is common to multiple documents.
> > > > > In such cases it will be difficult to backtrack which document has
> > > > > the issue, because admin/analysis can only be used for field-level
> > > > > analysis.
> > > >
> > > >
> > > >
> > > > Best Regards,
> > > > Dinesh Naik
> > > >
> > > > > On Tue, Jun 30, 2015 at 7:08 PM, Erick Erickson
> > > > > <erickerick...@gmail.com> wrote:
> > > >
> > > > > Dinesh:
> > > > >
> > > > > This is what the admin/analysis page is for. It shows you exactly
> > > > > what tokens are produced by what steps in the analysis chain.
> > > > > That would be far better than trying to analyze the indexed
> > > > > terms.
> > > > >
> > > > > Best,
> > > > > Erick
> > > > >
> > > > > > On Tue, Jun 30, 2015 at 8:35 AM, dinesh naik
> > > > > > <dineshkumarn...@gmail.com> wrote:
> > > > > > > Hi Erick,
> > > > > > > This is mainly for debugging purposes. If I have 20M records and
> > > > > > > a few fields in some of the documents are not indexed as
> > > > > > > expected, or something went wrong during indexing, how do we
> > > > > > > pinpoint the exact issue and fix the problem?
> > > > > >
> > > > > >
> > > > > > Best Regards,
> > > > > > Dinesh Naik
> > > > > >
> > > > > > > On Tue, Jun 30, 2015 at 5:56 PM, Erick Erickson
> > > > > > > <erickerick...@gmail.com> wrote:
> > > > > >
> > > > > >> In short, not unless you want to get into low-level Lucene coding.
> > > > > >> Inverted indexes are, well, inverted so their very structure makes
> > > > > >> this difficult. It looks like this:
> > > > > >>
> > > > > > >> But I'm not convinced yet that this isn't an XY problem. What is
> > > > > > >> the high-level problem you're trying to solve here? Maybe there's
> > > > > > >> another way to go about it.
> > > > > >>
> > > > > >> Best,
> > > > > >> Erick
> > > > > >>
> > > > > > >> On Tue, Jun 30, 2015 at 3:32 AM, dinesh naik
> > > > > > >> <dineshkumarn...@gmail.com> wrote:
> > > > > > >> > Thanks Erick and Upayavira for your inputs.
> > > > > > >> >
> > > > > > >> > Is there a way I can associate this with the unique id of a
> > > > > > >> > document, using either the schema browser or TermsComponent?
> > > > > >> >
> > > > > >> > Best Regards,
> > > > > >> > Dinesh Naik
> > > > > >> >
> > > > > > >> > On Tue, Jun 30, 2015 at 2:55 AM, Upayavira <u...@odoko.co.uk> wrote:
> > > > > > >> >
> > > > > > >> >> Use the schema browser on the admin UI, and click the "load term
> > > > > > >> >> info" button. It'll show you the terms in your index.
> > > > > > >> >>
> > > > > > >> >> You can also use the analysis tab, which will show you how it
> > > > > > >> >> would tokenise stuff for a specific field.
> > > > > >> >>
> > > > > >> >> Upayavira
> > > > > >> >>
> > > > > >> >> On Mon, Jun 29, 2015, at 06:53 PM, Dinesh Naik wrote:
> > > > > > >> >> > Hi Erick,
> > > > > > >> >> > By compressed value I meant the value of a field after removing
> > > > > > >> >> > special characters. In my example it's "-". The compressed form
> > > > > > >> >> > of red-apple is redapple.
> > > > > > >> >> >
> > > > > > >> >> > I wanted to know if we can see the analyzed version of fields.
> > > > > > >> >> >
> > > > > > >> >> > For example, if I use ngram on a field, how do I see the
> > > > > > >> >> > analyzed values in the index?
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> > -----Original Message-----
> > > > > >> >> > From: "Erick Erickson" <erickerick...@gmail.com>
> > > > > > >> >> > Sent: 29-06-2015 18:12
> > > > > > >> >> > To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> > > > > > >> >> > Subject: Re: Reading indexed data from solr 5.1.0 using admin/luke?
> > > > > >> >> >
> > > > > > >> >> > Not quite sure what you mean by "compressed values". admin/luke
> > > > > > >> >> > doesn't show the results of the compression of the stored
> > > > > > >> >> > values; there's no way I know of to do that.
> > > > > >> >> >
> > > > > >> >> > Best,
> > > > > >> >> > Erick
> > > > > >> >> >
> > > > > > >> >> > On Mon, Jun 29, 2015 at 8:20 AM, dinesh naik
> > > > > > >> >> > <dineshkumarn...@gmail.com> wrote:
> > > > > > >> >> > > Hi all,
> > > > > > >> >> > >
> > > > > > >> >> > > Is there a way to read the indexed data for a field on which
> > > > > > >> >> > > the analysis/processing has been done?
> > > > > > >> >> > >
> > > > > > >> >> > > I know that using the admin GUI we can see field-wise analysis,
> > > > > > >> >> > > but how can I get hold of the complete document using
> > > > > > >> >> > > admin/luke, or any other way?
> > > > > > >> >> > >
> > > > > > >> >> > > For example, say I have 2 fields called name and compressedname.
> > > > > > >> >> > >
> > > > > > >> >> > > name has values like apple, green-apple, red-apple
> > > > > > >> >> > > compressedname has values like apple, greenapple, redapple
> > > > > > >> >> > >
> > > > > > >> >> > > Even though I make both these fields indexed=true and
> > > > > > >> >> > > stored=true, I am not able to see the compressed values using
> > > > > > >> >> > > admin/luke?id=<mydocid>
> > > > > > >> >> > >
> > > > > > >> >> > > In the response I see something like this:
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > <lst name="name ">
> > > > > >> >> > > <str name="type">string</str>
> > > > > >> >> > > <str name="schema">ITS--------------</str>
> > > > > >> >> > > <str name="flags">ITS--------------</str>
> > > > > >> >> > > <str name="value">GREEN-APPLE</str>
> > > > > >> >> > > <str name="internal">GREEN-APPLE</str>
> > > > > >> >> > > <float name="boost">1.0</float>
> > > > > >> >> > > <int name="docFreq">0</int>
> > > > > >> >> > > </lst>
> > > > > >> >> > > <lst name="compressedname">
> > > > > >> >> > > <str name="type">string</str>
> > > > > >> >> > > <str name="schema">ITS--------------</str>
> > > > > >> >> > > <str name="flags">ITS--------------</str>
> > > > > >> >> > > <str name="value">GREEN-APPLE</str>
> > > > > >> >> > > <str name="internal">GREEN-APPLE</str>
> > > > > >> >> > > <float name="boost">1.0</float>
> > > > > >> >> > > <int name="docFreq">0</int>
> > > > > >> >> > > </lst>
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > >
> > > > > >> >> > > --
> > > > > >> >> > > Best Regards,
> > > > > >> >> > > Dinesh Naik
> > > > > >> >>
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> > --
> > > > > >> > Best Regards,
> > > > > >> > Dinesh Naik
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best Regards,
> > > > > > Dinesh Naik
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards,
> > > > Dinesh Naik
> > > >
> > >
> > >
> > >
> > > --
> > > --------------------------
> > >
> > > Benedetti Alessandro
> > > Visiting card : http://about.me/alessandro_benedetti
> > >
> > > "Tyger, tyger burning bright
> > > In the forests of the night,
> > > What immortal hand or eye
> > > Could frame thy fearful symmetry?"
> > >
> > > William Blake - Songs of Experience -1794 England
> > >
> >
> >
> >
> > --
> > Best Regards,
> > Dinesh Naik
> >
> 
> 
> 
> -- 
> --------------------------
> 
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
> 
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
> 
> William Blake - Songs of Experience -1794 England
