Re: Nutch 1.3 release

2011-04-14 Thread Julien Nioche
There has been a large number of substantial changes with 1.3 (search
delegated to SOLR, separation between local and distributed runtimes, ...)
and we'll need to reflect this in the documentation, the site and the wiki.
The good news is that a lot of this will be relevant for 2.0 as well.

BTW thanks for your work on cleaning up old issues on JIRA

On 13 April 2011 23:13, Markus Jelsma  wrote:

> Hi,
>
> There are 4 open issues for 1.3. 2 are already fixed in 1.3, of which 1 is
> ready to commit for trunk; the other is fixing license headers for trunk.
> Two very small issues remain, which I can fix within the next few days.
>
> Beyond that, I'll at least do a clean build and a complete crawl with
> some plugins enabled to see if anything else goes wrong.
>
> What's up next? I read Chris volunteers to manage the release, which would
> be great, but what else is there to do for 1.3?
>
> Cheers,
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com


[jira] [Resolved] (NUTCH-980) Fix IllegalAccessError with slf4j used in Solrj.

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma resolved NUTCH-980.
-

Resolution: Fixed

Committed for trunk in rev. 1092062.

> Fix IllegalAccessError with slf4j used in Solrj.
> 
>
> Key: NUTCH-980
> URL: https://issues.apache.org/jira/browse/NUTCH-980
> Project: Nutch
> Issue Type: Bug
> Components: build
> Affects Versions: 1.3
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Blocker
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-980-1.3.patch, NUTCH-980-trunk.patch
>
>
> Currently Solr commands fail because of:
>  Exception in thread "main" java.lang.IllegalAccessError: tried to access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class org.slf4j.LoggerFactory
>  at org.slf4j.LoggerFactory.staticInitialize(LoggerFactory.java:83)
>  at org.slf4j.LoggerFactory.(LoggerFactory.java:73)
>  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:78)
> Julien looked it up (http://www.slf4j.org/faq.html#IllegalAccessError): we 
> need to change the versions in Ivy. I haven't gotten around to testing it with 
> trunk yet, so we need to check for it there as well.
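
Since the error above points at mismatched slf4j jars, one way to check what actually ends up on the classpath is to print where each class is loaded from. The following is only a minimal diagnostic sketch, not part of Nutch or of the NUTCH-980 patch; the class name Slf4jLocationCheck and its helper are made up for illustration.

```java
import java.security.CodeSource;

/** Diagnostic sketch: report which jar (or directory) a class is loaded from. */
class Slf4jLocationCheck {

    /** Returns the code source location of a class, or a short note if it is unknown. */
    static String locationOf(String className) {
        try {
            // initialize=false: look the class up without running its static initializer,
            // so a broken slf4j binding does not blow up inside this check.
            Class<?> cls = Class.forName(className, false,
                    Slf4jLocationCheck.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes that belong to the JDK itself usually have no code source.
            return source == null ? "(no code source)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // The two classes named in the stack trace above.
        String[] names = {"org.slf4j.LoggerFactory", "org.slf4j.impl.StaticLoggerBinder"};
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

If the two classes resolve to jars with different version numbers, that would be consistent with the mismatch described in the slf4j FAQ entry linked above.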

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (NUTCH-976) SolrIndex constants in wrong namespace (or prefix)

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated NUTCH-976:


Attachment: NUTCH-976-1.3-2.patch
NUTCH-976-trunk-2.patch

Patches for 1.3 and trunk. solr.commit.size has been added to nutch-default and 
solrindex.mapping is now solr.mapping. Mapping is now also added to 
SolrConstants.

> SolrIndex constants in wrong namespace (or prefix)
> --
>
> Key: NUTCH-976
> URL: https://issues.apache.org/jira/browse/NUTCH-976
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 1.2, 1.3, 2.0
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-976-1.3-1.patch, NUTCH-976-1.3-2.patch, 
> NUTCH-976-1.3-trunk.patch, NUTCH-976-trunk-2.patch
>
>
> The shipped nutch-default.xml configuration file uses solrindex. as namespace 
> for configuration parameters but the namespace (or prefix) in SolrConstants 
> is solr instead. It should be solrindex.



[jira] [Commented] (NUTCH-976) SolrIndex constants in wrong namespace (or prefix)

2011-04-14 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019749#comment-13019749
 ] 

Markus Jelsma commented on NUTCH-976:
-

All seems to be alright now for trunk and 1.3. Any further objections to this 
issue and NUTCH-977?



[jira] [Commented] (NUTCH-975) Fix missing/wrong headers in source files

2011-04-14 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019759#comment-13019759
 ] 

Markus Jelsma commented on NUTCH-975:
-

Great stuff Julien! I'll also add the header for bin/nutch which is also 
missing and get ready to commit.

> Fix missing/wrong headers in source files
> -
>
> Key: NUTCH-975
> URL: https://issues.apache.org/jira/browse/NUTCH-975
> Project: Nutch
> Issue Type: Task
> Affects Versions: 1.3, 2.0
> Reporter: Markus Jelsma
> Priority: Blocker
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-975-1.3.patch, NUTCH-975-2.0.patch
>
>
> It seems several source files still do not contain the proper ASL headers. 
> This includes older code in 1.3 (indexer.NutchField etc.) and recent code in 
> 2.0 (API for instance). This should be fixed (yet again). So if you spot one 
> ;)



[jira] [Commented] (NUTCH-976) SolrIndex constants in wrong namespace (or prefix)

2011-04-14 Thread Julien Nioche (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019758#comment-13019758
 ] 

Julien Nioche commented on NUTCH-976:
-

Looks good to me, thanks Markus



[jira] [Updated] (NUTCH-975) Fix missing/wrong headers in source files

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated NUTCH-975:


Attachment: NUTCH-975-trunk-bin.patch

Here's the patch for bin/nutch in trunk.

> Fix missing/wrong headers in source files
> -
>
> Key: NUTCH-975
> URL: https://issues.apache.org/jira/browse/NUTCH-975
> Project: Nutch
> Issue Type: Task
> Affects Versions: 1.3, 2.0
> Reporter: Markus Jelsma
> Priority: Blocker
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-975-1.3.patch, NUTCH-975-2.0.patch, 
> NUTCH-975-trunk-bin.patch
>
>
> It seems several source files still do not contain the proper ASL headers. 
> This includes older code in 1.3 (indexer.NutchField etc.) and recent code in 
> 2.0 (API for instance). This should be fixed (yet again). So if you spot one 
> ;)



[jira] [Resolved] (NUTCH-975) Fix missing/wrong headers in source files

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma resolved NUTCH-975.
-

Resolution: Fixed
  Assignee: Markus Jelsma

Everything builds and runs fine with these patches. I've committed this for 
trunk in rev. 1092082. Thanks Julien for checking trunk!

> Fix missing/wrong headers in source files
> -
>
> Key: NUTCH-975
> URL: https://issues.apache.org/jira/browse/NUTCH-975
> Project: Nutch
> Issue Type: Task
> Affects Versions: 1.3, 2.0
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Blocker
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-975-1.3.patch, NUTCH-975-2.0.patch, 
> NUTCH-975-trunk-bin.patch
>
>
> It seems several source files still do not contain the proper ASL headers. 
> This includes older code in 1.3 (indexer.NutchField etc.) and recent code in 
> 2.0 (API for instance). This should be fixed (yet again). So if you spot one 
> ;)



[jira] [Updated] (NUTCH-976) Rename properties solrindex.* to solr.*

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated NUTCH-976:


Description: 
All Solr properties are now consistently using solr.* instead of solrindex.*. 
This has been changed for solrindex.mapping.file which was not configurable at 
all.

Was: The shipped nutch-default.xml configuration file uses solrindex. as 
namespace for configuration parameters but the namespace (or prefix) in 
SolrConstants is solr instead. It should be solrindex.

  was:The shipped nutch-default.xml configuration file uses solrindex. as 
namespace for configuration parameters but the namespace (or prefix) in 
SolrConstants is solr instead. It should be solrindex.

Summary: Rename properties solrindex.* to solr.*   (was: SolrIndex 
constants in wrong namespace (or prefix))

> Rename properties solrindex.* to solr.* 
> 
>
> Key: NUTCH-976
> URL: https://issues.apache.org/jira/browse/NUTCH-976
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 1.2, 1.3, 2.0
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-976-1.3-1.patch, NUTCH-976-1.3-2.patch, 
> NUTCH-976-1.3-trunk.patch, NUTCH-976-trunk-2.patch
>
>
> All Solr properties are now consistently using solr.* instead of solrindex.*. 
> This has been changed for solrindex.mapping.file which was not configurable 
> at all.
> Was: The shipped nutch-default.xml configuration file uses solrindex. as 
> namespace for configuration parameters but the namespace (or prefix) in 
> SolrConstants is solr instead. It should be solrindex.



[jira] [Resolved] (NUTCH-976) Rename properties solrindex.* to solr.*

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma resolved NUTCH-976.
-

Resolution: Fixed

Fixed in 1.3 in rev 1092084 and for trunk in rev 1092085. Thanks Julien for 
commenting.



[jira] [Resolved] (NUTCH-977) SolrMappingReader uses hardcoded configuration parameter name for mapping file

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma resolved NUTCH-977.
-

Resolution: Fixed

Committed for trunk in rev. 1092090 and for 1.3 in rev. 1092091.

> SolrMappingReader uses hardcoded configuration parameter name for mapping file
> --
>
> Key: NUTCH-977
> URL: https://issues.apache.org/jira/browse/NUTCH-977
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.2, 1.3, 2.0
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.3, 2.0
>
> Attachments: NUTCH-977-1.3.patch, NUTCH-977-trunk.patch
>
>
> The SolrMappingReader only works because it uses a hard-coded value for the 
> name of the mapping file configuration parameter. It should rely on 
> SolrConstants instead of using a hard-coded value.
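
The idea of looking the parameter name up through a shared constant rather than a string literal can be sketched as follows. This is an illustration only: java.util.Properties stands in for the real configuration object, the class MappingConfigSketch is hypothetical, and the default file name is an assumption rather than something stated in this thread. The key solr.mapping.file follows the solr.* naming described for NUTCH-976.

```java
import java.util.Properties;

/** Sketch: resolve the mapping file through one shared constant, not a literal. */
class MappingConfigSketch {

    // Hypothetical stand-ins for the constants kept in SolrConstants.
    static final String SOLR_PREFIX = "solr.";
    static final String MAPPING_FILE = SOLR_PREFIX + "mapping.file";

    // Assumed default, for illustration only.
    static final String DEFAULT_MAPPING_FILE = "solrindex-mapping.xml";

    /** Reads the mapping file name, falling back to the default when unset. */
    static String mappingFile(Properties conf) {
        return conf.getProperty(MAPPING_FILE, DEFAULT_MAPPING_FILE);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MAPPING_FILE, "my-mapping.xml");
        System.out.println(mappingFile(conf));
    }
}
```

With a single constant, renaming the prefix again only touches one place, and the reader and the shipped defaults cannot drift apart silently.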



Re: Nutch 1.3 release

2011-04-14 Thread Julien Nioche
Guys,

Thanks to Markus' hard work we don't have any issues left for 1.3. I seem to
remember that Chris offered to take care of the release, Chris - are you
still OK to do it?
Any objections anyone?

Thanks

Julien


[jira] [Closed] (NUTCH-922) SolrWriter should log source fields that are not mapped

2011-04-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma closed NUTCH-922.
---

Resolution: Not A Problem

No problem, unmapped fields are written anyway.

> SolrWriter should log source fields that are not mapped
> ---
>
> Key: NUTCH-922
> URL: https://issues.apache.org/jira/browse/NUTCH-922
> Project: Nutch
> Issue Type: Improvement
> Components: indexer
> Reporter: Markus Jelsma
> Fix For: 2.0
>
>
> Currently the SolrWriter::write() method silently ignores source fields that 
> have no mapping to a Solr field. Fields that are ignored should be logged. 
> Any more thoughts on this one?
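
The closing comment says unmapped fields are written anyway. A minimal sketch of that fall-through behaviour, assuming plain maps instead of Nutch's own document and mapping classes; FieldMappingSketch and mapFields are hypothetical names, not the SolrWriter API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: rename mapped fields, pass unmapped fields through under their own name. */
class FieldMappingSketch {

    static Map<String, String> mapFields(Map<String, String> doc,
                                         Map<String, String> mapping) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : doc.entrySet()) {
            // An unmapped source field keeps its original name instead of being dropped.
            String target = mapping.getOrDefault(field.getKey(), field.getKey());
            out.put(target, field.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("content", "some text");
        doc.put("custom", "extra value");
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("content", "body");
        System.out.println(mapFields(doc, mapping));
    }
}
```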



Precopy http.agent properties to nutch-site

2011-04-14 Thread Markus Jelsma
Hi guys,

Maybe a last convenience would be to precopy the mandatory http.agent 
properties to nutch-site. This would, in my opinion, encourage users to set 
the properties not in nutch-default but where they should be set, in nutch-site. 
Thoughts?
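
For illustration, a small check that a site configuration actually sets http.agent.name could look like the sketch below. It assumes the usual configuration/property/name/value XML layout of nutch-site; the class AgentNameCheck and the sample value are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: test whether a property is present and non-empty in a config XML. */
class AgentNameCheck {

    static boolean hasNonEmptyProperty(String xml, String wanted) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                if (wanted.equals(text(prop, "name")) && !text(prop, "value").isEmpty()) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            // Unparseable input is reported as an argument problem.
            throw new IllegalArgumentException("not a readable configuration", e);
        }
    }

    /** Trimmed text of the first child element with the given tag, or "" if absent. */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String site = "<configuration><property><name>http.agent.name</name>"
                + "<value>MyCrawler</value></property></configuration>";
        System.out.println(hasNonEmptyProperty(site, "http.agent.name"));
    }
}
```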

Cheers,
-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Re: Nutch 1.3 release

2011-04-14 Thread Markus Jelsma
Is the RM responsible for changing all 1.3-dev occurrences to 1.3, such as in 
pom, default.properties and http.agent name? The wiki states that updating 
default.properties is part of releasing, but what about -dev?
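
Finding the leftover 1.3-dev strings before a release can be done with a simple scan. The sketch below works on lines already read into memory; the class DevVersionScan and the sample lines are made up, and reading the real pom or default.properties is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: list the 1-based line numbers that still mention a version marker. */
class DevVersionScan {

    static List<Integer> linesMentioning(List<String> lines, String marker) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(marker)) {
                hits.add(i + 1); // report human-friendly line numbers
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical file content, for illustration only.
        List<String> sample = Arrays.asList("name=nutch", "version=1.3-dev");
        System.out.println(linesMentioning(sample, "1.3-dev"));
    }
}
```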



Re: Nutch 1.3 release

2011-04-14 Thread Mattmann, Chris A (388J)
I'm ready...

Should have some time Sat or Sun to cut an RC...

Cheers,
Chris



++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++



Re: Nutch 1.3 release

2011-04-14 Thread Mattmann, Chris A (388J)
Yep, RM is responsible for this, no worries. If you guys can wait until Sat/Sun 
I'll roll the RC...

Cheers,
Chris




Re: Nutch 1.3 release

2011-04-14 Thread Markus Jelsma
Would be great. I'm able to test next week.



[jira] [Commented] (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic

2011-04-14 Thread Dietrich Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019982#comment-13019982
 ] 

Dietrich Schmidt commented on NUTCH-422:


This is not compatible with Nutch 2. I made a quick attempt to refactor, but it 
is too complex without a good understanding of the Nutch architecture. Has anyone 
else tried their luck?


> index-extra plugin creates additional fields in the index, based on 
> configurable logic
> --
>
> Key: NUTCH-422
> URL: https://issues.apache.org/jira/browse/NUTCH-422
> Project: Nutch
> Issue Type: New Feature
> Components: indexer
> Affects Versions: 0.8.1
> Environment: All environments
> Reporter: Alan Tanaman
> Assignee: Sami Siren
> Attachments: ExtraIndexingFilter.java, 
> index-extra-v1.0-bin-java1.5.zip, index-extra-v1.0-source.zip
>
>
> Extract from the Readme file:
>
> A.  Introduction
> The index-extra plugin allows you to configure additional fields that you 
> wish to be added to the index, based on one of the following sources:
>   - The parsed text
>   - Meta data fields
>   - Previously created document-to-be-indexed fields
>   - Plain constant string
>   - Java expression combining one or more of the above, and resolving to 
>     a string
> A regex can also be applied to any of the above, allowing fields to be 
> created based on patterns extracted from the source.
>
> B.  Installation
> 1)  Binaries only:
>   - Copy the 'index-extra' folder within index-extra-v1.0-bin-java1.5.zip 
>     to NUTCHDIR/build
>   - Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure
>   - Enable the plugin by updating the nutch-site.xml file
> 2)  Source code: Always refer to the Nutch wiki for detailed instructions 
>     on building Nutch. In short:
>   - Copy the 'index-extra' folder within index-extra-v1.0-source.zip to 
>     NUTCHDIR/src/plugin
>   - Update the build.xml in NUTCHDIR/src/plugin to include plugin
>   - Update the NUTCHDIR/default.properties file to include plugin
>   - Run ant to build
>   - Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure
>   - Enable the plugin by updating the nutch-site.xml file
>
> C.  Known Issues
> 1)  For this plugin to work correctly on any document field, it is 
>     necessary to run the other index filters first, so that all basic 
>     document fields are generated first. To do this, configure the 
>     indexingfilter.order property. (Please see patch NUTCH-421 to enable 
>     the indexingfilter.order property. If this patch is not applied, the 
>     plugin will still work, but will not be able to use document fields 
>     created by other index filter plugins.)
> 2)  At this stage, field boost cannot be used, as Nutch scoring overrides 
>     the field boost with its own document-level boost calculation. This 
>     occurs at the end of org.apache.nutch.indexer.Indexer's reduce method.
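
The Readme's remark that a regex can be applied to a source to create a field can be pictured with a small sketch. This is not the plugin's code or its configuration format; RegexFieldSketch, the sample text and the pattern are all hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: derive an extra field value from a source string via a regex. */
class RegexFieldSketch {

    /** Returns the first capture group (or the whole match if the pattern has none). */
    static Optional<String> extract(String source, String regex) {
        Matcher m = Pattern.compile(regex).matcher(source);
        if (!m.find()) {
            return Optional.empty(); // no match: no extra field is created
        }
        // ofNullable: an optional group may match nothing even when find() succeeds.
        return Optional.ofNullable(m.groupCount() >= 1 ? m.group(1) : m.group());
    }

    public static void main(String[] args) {
        String parsedText = "Order ref: AB-1234 shipped";
        System.out.println(extract(parsedText, "ref: (\\w+-\\d+)"));
    }
}
```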



Build failed in Jenkins: Nutch-trunk #1457

2011-04-14 Thread Apache Hudson Server
See 

--
[...truncated 1013 lines...]
A src/plugin/subcollection/src/java/org/apache/nutch/indexer
A src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection
A src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection/SubcollectionIndexingFilter.java
A src/plugin/subcollection/README.txt
A src/plugin/subcollection/plugin.xml
A src/plugin/subcollection/build.xml
A src/plugin/index-more
A src/plugin/index-more/ivy.xml
A src/plugin/index-more/src
A src/plugin/index-more/src/test
A src/plugin/index-more/src/test/org
A src/plugin/index-more/src/test/org/apache
A src/plugin/index-more/src/test/org/apache/nutch
A src/plugin/index-more/src/test/org/apache/nutch/indexer
A src/plugin/index-more/src/test/org/apache/nutch/indexer/more
A src/plugin/index-more/src/test/org/apache/nutch/indexer/more/TestMoreIndexingFilter.java
A src/plugin/index-more/src/java
A src/plugin/index-more/src/java/org
A src/plugin/index-more/src/java/org/apache
A src/plugin/index-more/src/java/org/apache/nutch
A src/plugin/index-more/src/java/org/apache/nutch/indexer
A src/plugin/index-more/src/java/org/apache/nutch/indexer/more
A src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
A src/plugin/index-more/src/java/org/apache/nutch/indexer/more/package.html
A src/plugin/index-more/plugin.xml
A src/plugin/index-more/build.xml
AU src/plugin/plugin.dtd
A src/plugin/parse-ext
A src/plugin/parse-ext/ivy.xml
A src/plugin/parse-ext/src
A src/plugin/parse-ext/src/test
A src/plugin/parse-ext/src/test/org
A src/plugin/parse-ext/src/test/org/apache
A src/plugin/parse-ext/src/test/org/apache/nutch
A src/plugin/parse-ext/src/test/org/apache/nutch/parse
A src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext
A src/plugin/parse-ext/src/test/org/apache/nutch/parse/ext/TestExtParser.java
A src/plugin/parse-ext/src/java
A src/plugin/parse-ext/src/java/org
A src/plugin/parse-ext/src/java/org/apache
A src/plugin/parse-ext/src/java/org/apache/nutch
A src/plugin/parse-ext/src/java/org/apache/nutch/parse
A src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext
A src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext/ExtParser.java
A src/plugin/parse-ext/plugin.xml
A src/plugin/parse-ext/build.xml
A src/plugin/parse-ext/command
A src/plugin/urlnormalizer-pass
A src/plugin/urlnormalizer-pass/ivy.xml
A src/plugin/urlnormalizer-pass/src
A src/plugin/urlnormalizer-pass/src/test
A src/plugin/urlnormalizer-pass/src/test/org
A src/plugin/urlnormalizer-pass/src/test/org/apache
A src/plugin/urlnormalizer-pass/src/test/org/apache/nutch
A src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net
A src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer
A src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer/pass
AU src/plugin/urlnormalizer-pass/src/test/org/apache/nutch/net/urlnormalizer/pass/TestPassURLNormalizer.java
A src/plugin/urlnormalizer-pass/src/java
A src/plugin/urlnormalizer-pass/src/java/org
A src/plugin/urlnormalizer-pass/src/java/org/apache
A src/plugin/urlnormalizer-pass/src/java/org/apache/nutch
A src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net
A src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer
A src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer/pass
AU src/plugin/urlnormalizer-pass/src/java/org/apache/nutch/net/urlnormalizer/pass/PassURLNormalizer.java
AU src/plugin/urlnormalizer-pass/plugin.xml
AU src/plugin/urlnormalizer-pass/build.xml
A src/plugin/parse-html
A src/plugin/parse-html/ivy.xml
A src/plugin/parse-html/lib
A src/plugin/parse-html/lib/tagsoup.LICENSE.txt
A src/plugin/parse-html/src
A src/plugin/parse-html/src/test
A src/plugin/parse-html/src/test/org
A src/plugin/parse-html/src/test/org/apache
A src/plugin/parse-html/src/test/org/apache/nutch
A src/plugin/parse-html/src/test/org/apache/nutch/parse
A src/plugin/parse-html/src/test/org/apache/nutch/parse/html
A src/plugin/parse-html/src/test/org/apache/nutch/parse/html/TestRobotsMetaProcessor.java
A src/plugin/parse-html/src/test/org/apache/nutch/parse/html/TestDOMContentUtils.java
A src/plugin/parse-html/src/java
A src/plugin/parse-html/src/java/org
A src/plugin/parse-