Re: Pending Commits for Nutch Issues

2008-12-02 Thread John Martyniak
Is NUTCH-442 going to be part of the 1.0 release?  I hope so; Nutch/Solr
integration would be huge.


Just my two cents.

-John

On Nov 27, 2008, at 12:10 PM, Doğacan Güney wrote:

And here is a list of issues from me that need more discussion/review:


NUTCH-442 - Integrate Nutch/Solr: If NUTCH-442 is too complex for
people to review, for now we can just write a SolrIndexer like Sami
Siren's and deal with 442 after 1.0. I would be happy to provide such
a patch.

NUTCH-631 - MoreIndexingFilter fails with NoSuchElementException: I
don't know how to fix this one but indexing almost always fails with
index-more enabled.

NUTCH-652 - AdaptiveFetchSchedule#setFetchSchedule doesn't calculate
fetch interval correctly: I botched it once so now I am afraid to
commit it :D

NUTCH-626 - fetcher2 breaks out the domain with
db.ignore.external.links set at cross domain redirects: I am going to
update the patch and commit it if there are no objections.

Also, I think NUTCH-658 would be a nice feature for 1.0.

There are some others but these are the most recent and we really
should push 1.0 out the door already :D

Oh, and finally, we should do a review of all libraries in Nutch
(libraries in plugins included) and update them to their latest
versions. I am going to open an issue with the intention of updating
all the libraries that do not require code changes.

--
Doğacan Güney




Re: site: operator with no query term

2009-03-03 Thread John Martyniak

Frank,

I don't know what the timing on completing something like this is, but  
this would be a nice feature to have in 1.0, if that is even possible  
at this time.


-John

On Mar 3, 2009, at 5:19 PM, Otis Gospodnetic wrote:



Absolutely!  I see you are at home with JIRA, so I don't have to  
ask. :)


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----

From: Frank McCown 
To: nutch-dev@lucene.apache.org
Sent: Tuesday, March 3, 2009 9:39:24 AM
Subject: site: operator with no query term

Google, Yahoo, and Live list all pages they have indexed for the
"site:www.example.com" query.  But Nutch returns 0 results unless a
query term is also supplied (e.g., "site:www.example.com term").
Would it be better for Nutch to respond in the same manner that other
search engines do?  This is a change I'd be willing to tackle.

Frank
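
The behavior proposed above can be illustrated with a small standalone sketch. This is not Nutch code and does not use the Nutch or Lucene APIs; parse_query, search, and the DOCS sample data are hypothetical names made up for the illustration. It only shows the idea that a query consisting of a site: clause alone could be treated as "every indexed page on that host" rather than as an empty query.

```python
# Hypothetical in-memory "index"; a real index would not look like this.
DOCS = [
    {"url": "http://www.example.com/a", "host": "www.example.com", "text": "some term here"},
    {"url": "http://www.example.com/b", "host": "www.example.com", "text": "other words"},
    {"url": "http://www.other.org/c", "host": "www.other.org", "text": "term again"},
]


def parse_query(raw):
    """Split a raw query string into (site, terms).

    Illustrative parser only: one optional site: clause, everything
    else is treated as a plain lowercase term.
    """
    site = None
    terms = []
    for token in raw.split():
        if token.startswith("site:") and len(token) > len("site:"):
            site = token[len("site:"):].lower()
        else:
            terms.append(token.lower())
    return site, terms


def search(docs, raw):
    """Return the URLs of documents that match the query."""
    site, terms = parse_query(raw)
    if site is None and not terms:
        # A completely empty query still returns nothing.
        return []
    hits = []
    for doc in docs:
        if site is not None and doc["host"] != site:
            continue
        words = doc["text"].lower().split()
        # With no terms, all() is True, so a site-only query
        # matches every document on that host.
        if all(term in words for term in terms):
            hits.append(doc["url"])
    return hits
```

With this sketch, search(DOCS, "site:www.example.com") lists both pages on that host, while adding a term narrows the list to pages that contain it.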