> You're both correct, after changing the type for tstamp and lastModified
> from long to date, no error anymore.
> 
> Next thing I need to do is set up cygwin/svn to be able to get fresh
> svn/trunk code...it's so cool to be up-to-date. Nutch-1.4 is just
> ridiculously faster than 1.2 :-)
> 

Is it faster? I read such a thing somewhere on the list before, but I really
don't know why it would be faster. Must be a case of bad settings in 1.2, I
guess.



> Thanks!!
> 
> Remi
> 
> On Wed, Feb 15, 2012 at 9:14 PM, Markus Jelsma <[email protected]> wrote:
> > That was likely an old schema. In trunk (or was it already in 1.4?) it is
> > of type date.
> > http://svn.apache.org/viewvc/nutch/trunk/conf/schema.xml?view=markup
> > 
> > > Remi, I had a similar problem but for a custom field that I was trying
> > > to post to Solr (via solrindex) as a type="date" in the schema.xml.
> > > Turns out my date string was formatted incorrectly (it was missing the
> > > trailing Z). From the error message it appears that perhaps the field
> > > into which this value is going is set as long or int. If you set it to
> > > type="date" it should take it (and you can do Solr's date arithmetic
> > > on it).
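To make the point above concrete, here is a minimal Java sketch. It is not code from Nutch or Solr: the class SolrDateCheck and its methods are hypothetical names, and Instant.parse is used only as an approximation of the date format Solr expects. It shows that the value from the error message is a valid UTC timestamp but cannot be read as a long, and that dropping the trailing Z makes the timestamp unparseable.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;

// Hypothetical helper for illustration only; not part of Nutch or Solr.
class SolrDateCheck {

    // True if the value parses as an ISO-8601 instant such as
    // 2012-02-08T14:40:09.416Z. This only approximates Solr's date format.
    static boolean isUtcInstant(String value) {
        try {
            Instant.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // True if the value could be stored in a numeric (long) field.
    static boolean fitsLongField(String value) {
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException e) {
            // Same exception type as in the solrindex error quoted in this thread.
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "2012-02-08T14:40:09.416Z";
        System.out.println("valid instant: " + isUtcInstant(value));
        System.out.println("fits a long field: " + fitsLongField(value));
        System.out.println("valid without Z: " + isUtcInstant("2012-02-08T14:40:09.416"));
    }
}
```

If the value passes the first check but fails the second, the schema type is the likely culprit rather than the data.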
> > > 
> > > On Feb 15, 2012, at 11:01 AM, remi tassing wrote:
> > > > Awesome!
> > > > 
> > > > Pushing this to Solr gives me an error (solrindex):
> > > > SEVERE: java.lang.NumberFormatException: For input string:
> > > > "2012-02-08T14:40:09.416Z"
> > > > 
> > > >        at java.lang.NumberFormatException.forInputString(Unknown Source)
> > 
> > > > But I'll try to figure this out on my own
> > > > 
> > > > I really appreciate your help!
> > > > 
> > > > Remi
> > > > 
> > > > On Wed, Feb 15, 2012 at 8:18 PM, Markus Jelsma <[email protected]> wrote:
> > > >> sure, use the indexchecker tool.
> > > >> 
> > > >>> Is there any quick way to see the impact of index-more? I deleted the
> > > >>> parse related folders in the segment and re-parsed it but when I
> > > >>> readseg there is no difference....
> > > >>> 
> > > >>> On Wednesday, February 15, 2012, Lewis John Mcgibbney <[email protected]> wrote:
> > > >>>> Hi,
> > > >>>> 
> > > >>>> On Wed, Feb 15, 2012 at 4:00 PM, remi tassing <[email protected]> wrote:
> > > >>>>> tstamp shows a string of digits like 20020123123212
> > > >>>> 
> > > >>>> This is OK (yyyy-mm-dd-hh-mm-ssZ). It is however hellishly old!
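For readers comparing the two timestamp forms in this thread, a small Java sketch follows. It assumes, and this is only an assumption, that the digit string is laid out as yyyyMMddHHmmss and expressed in UTC. The class name TstampToIso is made up for the example and is not a Nutch class.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Nutch.
class TstampToIso {

    // Assumed layout of the compact tstamp value: yyyyMMddHHmmss.
    private static final DateTimeFormatter COMPACT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Converts a compact timestamp to ISO-8601 with a trailing Z,
    // treating the input as UTC (an assumption, see above).
    static String toIsoUtc(String tstamp) {
        LocalDateTime local = LocalDateTime.parse(tstamp, COMPACT);
        return local.toInstant(ZoneOffset.UTC).toString();
    }

    public static void main(String[] args) {
        System.out.println(toIsoUtc("20020123123212"));
    }
}
```

The result carries the trailing Z that a date-typed field expects, whereas the compact form only fits a numeric or string field.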
> > > >>>> 
> > > >>>>> Never heard of the plugin "index-more" and it's poorly
> > > >>>>> documented.
> > > >>>> 
> > > >>>> Well it's been included in 1.2 onwards so I'm very surprised at
> > > >>>> that. If you feel like it then please feel free to add
> > > >>>> documentation; this is always something we are after and would be
> > > >>>> a great help to the community.
> > > >>>> 
> > > >>>>> After adding this to plugins.include, I'll need to run solrindex or
> > > >>>>> is it necessary to re-parse or recrawl (I think this less likely
> > > >>>>> IMO)?
> > > >>>> 
> > > >>>> If you wish to have the fields we are able to extract with index-more
> > > >>>> e.g.
> > > >>>> 
> > > >>>> <!-- fields for index-more plugin -->
> > > >>>> <field name="type" type="string" stored="true" indexed="true"
> > > >>>>     multiValued="true"/>
> > > >>>> <field name="contentLength" type="long" stored="true"
> > > >>>>     indexed="false"/>
> > > >>>> <field name="lastModified" type="long" stored="true" indexed="true"/>
> > > >>>> <field name="date" type="string" stored="true" indexed="true"/>
> > > >>>> 
> > > >>>> then you'll need to add the plugin, I would rebuild the project if it
> > > >>>> is possible but this is not essential, then index your content. And
> > > >>>> yes I would expect the parsers need to be re-run to extract the
> > > >>>> lastModified value from pages.
