[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2016-06-14 Thread Steve Yao (JIRA)
> HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Components: protocol >A

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2016-06-13 Thread Sebastian Nagel (JIRA)
new Jira for this problem? Thanks! > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2016-06-10 Thread Steve Yao (JIRA)
d in HTTP.java > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Components: protoco

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2016-06-05 Thread Steve Yao (JIRA)
Default(new CookieManager()); // And the cookies policy could be changed here... String pageContent = httpGetPageContent(authConfigurer.getLoginUrl()); List params = getLoginFormParams(pageContent); sendPost(authConfigurer.getLoginUrl(), par
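
The flow hinted at in the snippet above (install a cookie manager, fetch the login page, collect the form parameters, POST them back) could look roughly like the sketch below. It is not the code from the patch: the class name FormLoginSketch, the regex-based form parsing, and the helper signatures are assumptions made for illustration, and the actual GET and POST requests are left out so that the example needs only the JDK and no network.

```java
import java.io.UnsupportedEncodingException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or of the NUTCH-827 patch.
class FormLoginSketch {

  // Matches each <input ...> tag; a real implementation would use an HTML parser.
  private static final Pattern INPUT =
      Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

  // Install a JVM-wide cookie store so that cookies set by the login
  // response are sent again on later java.net requests.
  static void installCookieManager() {
    CookieHandler.setDefault(new CookieManager());
  }

  // Returns the double-quoted value of one attribute of a tag, or null if absent.
  private static String attr(String tag, String name) {
    Matcher m = Pattern
        .compile("\\s" + name + "\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE)
        .matcher(tag);
    return m.find() ? m.group(1) : null;
  }

  // Collects every named input of the login page (including hidden tokens)
  // and then overwrites or adds the credential fields supplied by the caller.
  static Map<String, String> getLoginFormParams(String html,
      Map<String, String> credentials) {
    Map<String, String> params = new LinkedHashMap<>();
    Matcher m = INPUT.matcher(html);
    while (m.find()) {
      String tag = m.group();
      String name = attr(tag, "name");
      if (name == null) {
        continue; // inputs without a name are not submitted
      }
      String value = attr(tag, "value");
      params.put(name, value == null ? "" : value);
    }
    params.putAll(credentials);
    return params;
  }

  private static String enc(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always available
    }
  }

  // Builds the application/x-www-form-urlencoded body for the login POST.
  static String encode(Map<String, String> params) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : params.entrySet()) {
      if (sb.length() > 0) {
        sb.append('&');
      }
      sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    installCookieManager();
    String html = "<form id=\"login\"><input type=\"hidden\" name=\"token\" "
        + "value=\"abc\"><input type=\"text\" name=\"user\"></form>";
    Map<String, String> credentials = new LinkedHashMap<>();
    credentials.put("user", "bob");
    System.out.println(encode(getLoginFormParams(html, credentials)));
  }
}
```

Hidden inputs are copied through unchanged because many login forms carry a session or anti-forgery token that the server expects to see echoed back in the POST.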

[jira] [Updated] (NUTCH-1940) Port HTTP POST Authentication to 2.X

2015-07-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1940: Assignee: Talat UYARER > Port HTTP POST Authentication to

[jira] [Commented] (NUTCH-1940) Port HTTP POST Authentication to 2.X

2015-07-01 Thread Lewis John McGibbney (JIRA)
Nice work Talat. > Port HTTP POST Authentication to 2.X > > > Key: NUTCH-1940 > URL: https://issues.apache.org/jira/browse/NUTCH-1940 > Project: Nutch > Issue Type: New Feature >

[jira] [Updated] (NUTCH-1940) Port HTTP POST Authentication to 2.X

2015-07-01 Thread Talat UYARER (JIRA)
server uses compressed http connection. The original patch cannot read the content. I added a method named getResponseBody. > Port HTTP POST Authentication to 2.X > > > Key: NUTCH-1940 > URL: https://issues.apache.

Re: HTTP Post Authentication

2015-04-07 Thread Tyler Palsulich
I don't think this is what I was running into. I cannot replicate this error using Nutch trunk. The only thing that stood out to me about the xml config file was the lack of "" on the first line. But, I'm not sure that would actually make a difference. You can see https://github.com/tpalsulich/nut

Re: HTTP Post Authentication

2015-04-07 Thread feng lu
On Tue, Apr 7, 2015 at 8:11 PM, Tizy Ninan wrote: > NutchCrawler-1.0-SNAPSHOT.jar! Maybe your configuration format is correct now and before you were missing the auth-configuration tag. But I see you are still using 1.0-SNAPSHOT; you can try the latest trunk version of Nutch at https://github.com/apac

Re: HTTP Post Authentication

2015-04-07 Thread Mattmann, Chris A (3980)
, Los Angeles, CA 90089 USA ++ -Original Message- From: Tizy Ninan Reply-To: "dev@nutch.apache.org" Date: Tuesday, April 7, 2015 at 5:11 AM To: "u...@nutch.apache.org" Cc: "dev@nutch.apache.org

Re: HTTP Post Authentication

2015-04-07 Thread Tizy Ninan
Hi, I am still not able to crawl websites requiring authentication. The version of Nutch used is 1.10. While crawling I am getting the following warnings and still not able to identify what is going wrong. Please find the httpclient-auth.xml file in the following link. https://gist.github.com/ti

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-04-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-827: --- Labels: authentication memex (was: authentication) > HTTP POST Authenticat

HTTP POST Authentication

2015-03-27 Thread Tyler Palsulich
Hi Folks, I'm having trouble getting HTTP POST authentication to work on mrs.org. I have a valid username and password, but when I try to run parsechecker on a page that requires authentication (http://mrs.org/myMRS/), I get a 500 error and the following response: Validation of viewstate MAC f

Re: HTTP Post Authentication

2015-03-18 Thread Mohammed Omer
Tizy, in order to help debug your error, you'll need to provide additional information. Check out this link for what's generally needed when trying to debug over chat/email: http://www.mikeash.com/getting_answers The error seems to say that httpclient.Http doesn't like the auth conf file you provi

Re: HTTP Post Authentication

2015-03-18 Thread Mohammed Omer
Edit: The first link should be https://www.mikeash.com/getting_answers.html Thank you, Mo On Wed, Mar 18, 2015 at 8:16 PM, Mohammed Omer wrote: > Tizy, in order to help debug your error, you'll need to provide additional > information. Check out this link for what's generally needed when tryin

Re: HTTP Post Authentication

2015-03-12 Thread Tizy Ninan
Hi Lewis, Thank you for the reply. I tried by providing the parameters specified in the httpclient-auth.xml template file. But while crawling I am getting the following warnings. WARN httpclient.Http: Bad auth conf file: root element found in httpclient-auth.xml - must be WARN httpclient.Http:

Re: HTTP Post Authentication

2015-03-12 Thread Sebastian Nagel
Hi Tizy, this should help: https://wiki.apache.org/nutch/HttpPostAuthentication http://svn.apache.org/repos/asf/nutch/trunk/conf/httpclient-auth.xml.template For more details you could also check https://issues.apache.org/jira/browse/NUTCH-827 https://issues.apache.org/jira/browse/NUTCH-1943 Che

HTTP Post Authentication

2015-03-12 Thread Tizy Ninan
Hi, Is there any detailed step-by-step explanation on how to implement HTTPPostAuthentication on Nutch 1.10? Thanks and Regards, Tizy

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-17 Thread Tyler Palsulich (JIRA)
t. Thanks! > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Components: protocol >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-17 Thread Sebastian Nagel (JIRA)
case there are local changes). > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-16 Thread Tyler Palsulich (JIRA)
sn't included in the commit. > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-13 Thread Hudson (JIRA)
ttps://builds.apache.org/job/Nutch-trunk/2976/]) NUTCH-827 HTTP POST Authentication (lewismc: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1659701) * /nutch/trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpFormAuthConfigurer.java * /nutch/trunk/sr

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-13 Thread Lewis John McGibbney (JIRA)
tted @revision 1659701 > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Components: protoco

[jira] [Resolved] (NUTCH-827) HTTP POST Authentication

2015-02-13 Thread Lewis John McGibbney (JIRA)
involved. All credited in CHANGES > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Component

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-13 Thread Lewis John McGibbney (JIRA)
will commit this patch and log an issue to accommodate and address your final suggestion (and an excellent one it is too!). Thanks Seb. > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/ji

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-02-13 Thread Sebastian Nagel (JIRA)
obal and ignores {{}}. So you have to restrict your crawl to the form authentication pages only. Ideally, also form authentication should be bound to a scope (one host, one URL prefix, etc.) same as HTTP authentication. > HTTP POST Authentication > >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-13 Thread Lewis John McGibbney (JIRA)
get in to 1.10 > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Components: protoco

[jira] [Created] (NUTCH-1940) Port HTTP POST Authentication to 2.X

2015-02-10 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created NUTCH-1940: --- Summary: Port HTTP POST Authentication to 2.X Key: NUTCH-1940 URL: https://issues.apache.org/jira/browse/NUTCH-1940 Project: Nutch Issue Type

[jira] [Updated] (NUTCH-1940) Port HTTP POST Authentication to 2.X

2015-02-10 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1940: Issue Type: New Feature (was: Bug) > Port HTTP POST Authentication to

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-02-10 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-827: --- Fix Version/s: (was: 2.4) > HTTP POST Authenticat

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-02-09 Thread Lewis John McGibbney (JIRA)
LOGGER.debug("'name' attribute for form element is also null."); throw new IllegalArgumentException("No form exists: " + authConfigurer.getLoginFormId()); } } {code} The rest seem to be OK to me and I am able to use this patch to f

[jira] [Work stopped] (NUTCH-827) HTTP POST Authentication

2015-02-09 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-827 stopped by Lewis John McGibbney. -- > HTTP POST Authenticat

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-06 Thread Lewis John McGibbney (JIRA)
.bq log level TRACE should provide sufficient information what goes wrong when logging in +1 bq. config file to be committed should be conf/httpclient-auth.xml.template instead of conf/httpclient-auth.xml +1, patch coming up Thanks for review > HTTP POST Authentication > ---

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-02-06 Thread Sebastian Nagel (JIRA)
de/docs/Web/HTML/Element/form#Attributes]]). I'll continue this trial to provide a fix/work-around. - log level TRACE should provide sufficient information what goes wrong when logging in - config file to be committed should be {{conf/httpclient-auth.xml.template}} instead of {{conf/httpclient-auth

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-02-04 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-827: --- Fix Version/s: (was: 1.11) 1.10 > HTTP POST Authenticat

[jira] [Work started] (NUTCH-827) HTTP POST Authentication

2015-02-03 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-827 started by Lewis John McGibbney. -- > HTTP POST Authenticat

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-02-03 Thread Lewis John McGibbney (JIRA)
to enable access to various large Databases requiring HTTP Post authentication. I also would like to mention that setting the redirect boolean flag to true is usually required. Would really appreciate if folks could try this out and comment. > HTTP POST Authenticat

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2015-01-31 Thread Lewis John McGibbney (JIRA)
as I require form-based authentication for a current research task. > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch >

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2015-01-31 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-827: --- Fix Version/s: 2.4 > HTTP POST Authenticat

[jira] [Assigned] (NUTCH-827) HTTP POST Authentication

2015-01-31 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-827: -- Assignee: Lewis John McGibbney > HTTP POST Authenticat

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2014-04-05 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-827: Component/s: (was: fetcher) protocol > HTTP POST Authenticat

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2014-03-03 Thread yuanyun.cn (JIRA)
l: not protocol-http. If you are interested, you may read: http://lifelongprogrammer.blogspot.com/2014/02/part1-using-apache-http-client-to-do-http-post-form-authentication.html > HTTP POST Authentication > > > Key: NUTCH-827 >

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2014-03-03 Thread yuanyun.cn (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuanyun.cn updated NUTCH-827: - Attachment: http-client-form-authtication.patch > HTTP POST Authenticat

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-827: -- Fix Version/s: 1.8 > HTTP POST Authenticat

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2013-01-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-827: --- Fix Version/s: 2.2 > HTTP POST Authenticat

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-17 Thread Max Dzyuba (JIRA)
Thanks for the help! > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature > Components: fe

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-08 Thread Jasper van Veghel (JIRA)
"url: " + url + +"; status code: " + code + +"; cookies received: " + Http.getClient().getState().getCookies().length); {code} If you turn on TRACE logging, you should see messages like that.

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-08 Thread Max Dzyuba (JIRA)
Nutch and stored as intended? Thanks, Max > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: Ne

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-03 Thread Jasper van Veghel (JIRA)
hink that we ended up solving that simply by removing .. {code} +method.setFollowRedirects(followRedirects); {code} As redirects are not supported for POST-requests. > HTTP POST Authentication > > > Key: NUTCH-827 >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Max Dzyuba (JIRA)
this... Thanks for your time! Max > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New Feature >

[jira] [Comment Edited] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Jasper van Veghel (JIRA)
d authentication failed; cookies will not be present for this request but an attempt to retrieve them will be made for the next one.", e); To see where the Exception is coming from. All it does after that LOG.error() is release the connection. So it shouldn't be throwing an Excepti

[jira] [Comment Edited] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Jasper van Veghel (JIRA)
t for this request but an attempt to retrieve them will be made for the next one.", e); To see where the Exception is coming from. All it does after that LOG.error() is release the connection. So it shouldn't be throwing an Exception. > HTTP POST Authentication >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Jasper van Veghel (JIRA)
be made for the next one.", e); To see where the Exception is coming from. All it does after that LOG.error() is release the connection. So it shouldn't be throwing an Exception. > HTTP POST Authentication > > > Key:

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Max Dzyuba (JIRA)
made for the next one. 2012-10-01 13:11:24,682 ERROR httpclient.Http - Unable to retrieve login page; code = 200 The second line with response code 200 is what I don't understand. I'd appreciate any tips you could give in this regard. Thanks, Max > HT

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Max Dzyuba (JIRA)
ave to go that way then. Best regards, Max > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 > Project: Nutch > Issue Type: New

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Jasper van Veghel (JIRA)
ookie, and then returns some specific piece of data only when that cookie is set? Good luck! Jasper > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apache.org/jira/browse/NUTCH-827 >

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-10-01 Thread Max Dzyuba (JIRA)
n! I applied the patch and compiled Nutch just fine, but can't confirm that it is working. Can you point to a website that this patch worked to pass the form auth at? I need to verify that it is working for me, but can't at the moment. Thanks in advance, Max > HTTP PO

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-09-27 Thread Jasper van Veghel (JIRA)
jar which is used for each subsequent request. This isn't exactly a fool-proof solution (what if other requests generate expired cookies? what if the login fails? etc.), but for the project for which I wrote the patch, it suited our needs. Hope it helps! > HTTP POST

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-09-24 Thread Max Dzyuba (JIRA)
se answer Ian's question? I have a similar problem figuring out how exactly I can provide the username and password. Thank you! > HTTP POST Authentication > > > Key: NUTCH-827 > URL: https://issues.apa

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2012-04-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-827: Fix Version/s: (was: 1.5) 1.6 20120304-push-1.6 > HTTP P

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2012-01-06 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-827: Fix Version/s: 1.5 > HTTP POST Authenticat

[jira] [Commented] (NUTCH-827) HTTP POST Authentication

2012-01-05 Thread Ian Piper (Commented) (JIRA)
and password need to go? I presently have these in runtime/local/conf/httpclient-auth.xml, but this doesn't seem to work. Also, what url needs to go in the nutch-site.xml file? > HTTP POST Authentication > > >

[jira] Updated: (NUTCH-827) HTTP POST Authentication

2010-05-27 Thread Jasper van Veghel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jasper van Veghel updated NUTCH-827: Attachment: nutch-http-cookies.patch > HTTP POST Authenticat

[jira] Created: (NUTCH-827) HTTP POST Authentication

2010-05-27 Thread Jasper van Veghel (JIRA)
HTTP POST Authentication Key: NUTCH-827 URL: https://issues.apache.org/jira/browse/NUTCH-827 Project: Nutch Issue Type: New Feature Components: fetcher Affects Versions: 1.1, 2.0 Reporter