I was wrong. It wasn't because of the "read once free" policy. I tried again
with Java first, and this time it didn't work.
I searched Google and found the HTTP client you mentioned. It is the one
provided by Apache, right? I guess I will have to try that one now. Thanks!
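
For reference, here is a minimal sketch of the two things such a client does on your behalf: keeping cookies between requests and following each redirect. It uses only java.net instead of the Apache library. The class name CookieAwareFetcher, the helper isRedirect, and the maxRedirects limit are made up for illustration. It assumes Java 9 or later (for readAllBytes), and it is only a guess that missing cookies are what triggers the login redirect (the REFUSE_COOKIE_ERROR parameter in the Location header hints at it). It has not been tested against nytimes.com.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class CookieAwareFetcher {

    // True for status codes that normally carry a Location header to follow.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    // Downloads a page, sending back any cookies set along the way and
    // following at most maxRedirects redirects by hand.
    static String fetch(String address, int maxRedirects) throws IOException {
        // Install a JVM-wide cookie store; without one, HttpURLConnection
        // ignores Set-Cookie headers and never sends a Cookie header.
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));

        URL url = new URL(address);
        for (int hop = 0; hop <= maxRedirects; hop++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // The built-in redirect handling does not switch between http
            // and https, so each hop is handled explicitly here.
            conn.setInstanceFollowRedirects(false);

            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                try (InputStream in = conn.getInputStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            }

            String location = conn.getHeaderField("Location");
            if (location == null) {
                throw new IOException("Redirect without Location header: " + status);
            }
            // Location may be relative, so resolve it against the current URL.
            url = new URL(url, location);
            conn.disconnect();
        }
        throw new IOException("Gave up after " + maxRedirects + " redirects");
    }
}
```

Calling CookieAwareFetcher.fetch(articleUrl, 10) would then return the final page body as a string, or throw an IOException if the chain never ends. The limit of 10 is arbitrary.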

On 2010. 12. 10., at 11:33 AM, Hemanth Yamijala <yhema...@gmail.com> wrote:

> Not exactly what you may want, but could you try using an HTTP client
> in Java? Some of them can automatically follow
> redirects, manage cookies, etc.
> 
> Thanks
> hemanth
> 
> On Thu, Dec 9, 2010 at 4:35 PM, edward choi <mp2...@gmail.com> wrote:
>> Excuse me for asking a general Java question here.
>> I tried to find a Java mailing list through Google, but none of them were active.
>> 
>> There is a problem that's been driving me crazy for a while.
>> 
>> I am trying to download webpages from New York Times.
>> With Java's URL.openStream(), I can't get past the login requirement.
>> But with C++ socket programming (using read() and write()), I can download
>> any webpage just fine.
>> 
>> Interestingly, with C++ I get redirected about 10 times. Below are
>> the headers of the first redirect response when I try to
>> download
>> "
>> http://www.nytimes.com/glogin?URI=http://www.nytimes.com:80/2010/12/09/world/asia/09military.html&OQ=_rQ3D1Q26hp&OP=47c049a1Q2FVQ3EY6VQ5Dks9akk5Q27VQ27Q2AFQ2AVFQ27VQ2AtVQ3EkahQ5DVQ3F9Q5CQ3FVQ2AtbQ5ChQ5C5Q3Fa!Q2BN5bh
>> "
>> 
>> HTTP/1.1 302 Moved Temporarily
>> Server: Sun-ONE-Web-Server/6.1
>> Date: Thu, 09 Dec 2010 08:42:35 GMT
>> Content-type: text/html
>> Set-cookie: RMID=0b5d4aea392d4d00967bfaf1; expires=Friday, 09-Dec-2011 08:42:35 GMT; path=/; domain=.nytimes.com
>> Set-cookie: NYT_GR=4d009b2b-yJ4V047ooAmPtGcvASTmng; path=/; domain=.nytimes.com
>> Set-cookie: NYT-S=0Mzh9PJwQ663rDXrmvxADeHJOGvJvXmRaJdeFz9JchiAJK89nlVaR7bsV.Ynx4rkFI; expires=Saturday, 08-Jan-2011 08:42:35 GMT; path=/; domain=.nytimes.com
>> Set-cookie: NYT-Pref=hppznw|^creator|NYTD.Cookies; path=/; domain=.nytimes.com
>> Location: http://www.nytimes.com:80/2010/12/09/world/asia/09military.html?_r=1&hp
>> Expires: Thu, 01 Dec 1994 16:00:00 GMT
>> Cache-control: no-cache
>> Pragma: no-cache
>> Connection: close
>> 
>> But with Java, I get redirected only once, to an https:// page, and it's a
>> dead end. Below is the result of java.net.URLConnection.getHeaderFields():
>> 
>> HTTP/1.1 301 Moved Permanently,
>> Date: Thu, 09 Dec 2010 10:50:53 GMT,
>> Content-type: text/html,
>> Content-length: 0,
>> Location: https://myaccount.nytimes.com/auth/login?URI=/2010/12/09/world/asia/09military.html&OQ=_rQ3D5Q26hp&REFUSE_COOKIE_ERROR=SHOW_ERROR,
>> Server: Sun-ONE-Web-Server/6.1,
>> 
>> There is a clear difference between the two. I don't know why and it's been
>> driving me crazy.
>> My guess is that the C++ write() function can create some kind of cookie by
>> itself, but Java's URL.openStream() can't.
>> 
>> Am I right? Or can anyone explain this to me?
>> 
