Thanks for your reply to my message.
  
It might be that the server is giving a 30x redirect response, which wget
follows automatically but get-pure-port by default does not.

Does it help if you supply a non-zero value (say, 10) for the optional
#:redirections argument to get-pure-port?

  Yes, I saw that mentioned during my Web search.  It turns out not to matter:

    $ cat d.rkt
    #lang racket

    (require net/url)

    (define library-url (string->url
      "http://example.com/search?/ftlist^bib18%2C1%2C0%2C2703/mode=2"))

    (call/input-url
      library-url
      (lambda (u) (get-pure-port u #:redirections 10))
      port->string)

    $ mzscheme --eval '(require "d.rkt")' > rr.html

    $ diff r.html rr.html

  But maybe I don't have the right number of redirects.  Approaching the
  question from the other direction:
    
    $ wget --max-redirect=0 -O ww.html --quiet \
        "http://example.com/search?/ftlist^bib18%2C1%2C0%2C2703/mode=2"

    $ diff w.html ww.html

    $

If it still isn't working, you could try using get-impure-port, calling
purify-port to get the headers, and seeing what they say.

  Ah, ok:

    $ cat d.rkt
    #lang racket

    (require net/url)

    (define library-url (string->url
      "http://example.com/search?/ftlist^bib18%2C1%2C0%2C2703/mode=2"))

    (define port (get-impure-port library-url))

    (purify-port port)

    $ mzscheme --eval '(require "d.rkt")' | sed -e 's/\\r\\n/~/g' | tr '~' '\012'
    "HTTP/1.1 200 OK
    Date: Fri, 11 Sep 2015 01:10:39 GMT
    Server: III 100
    Pragma: no-cache
    Content-Type: text/html; ISO-8859-1
    Expires: Fri, 11 Sep 2015 01:10:39 GMT
    Cache-control: no-cache
    Set-Cookie: SESSION_LANGUAGE=eng; path=/; domain=.example.com; path=/
    Set-Cookie: III_EXPT_FILE=aa20885; path=/; domain=.example.com; path=/
    Set-Cookie: III_SESSION_ID=3ba8ef1c57601a925bc732a44b4bb791; domain=.example.com; path=/
    Set-Cookie: SESSION_SCOPE=0; path=/
    Vary: User-Agent,Accept-Encoding
    Transfer-Encoding: chunked

    "

    $

  To the extent I understand these things (just barely), the header looks ok.
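
  As an aside, the sed/tr cleanup could probably be done in Racket itself.
  A minimal sketch, assuming purify-port hands back the whole header block
  as one string with \r\n line ends (which is what the output above
  suggests); header-lines and fetch-header-lines are names I made up:

```racket
#lang racket

(require net/url)

;; header-lines is a hypothetical helper: split a raw header block on
;; CRLF. #:repeat? #t treats runs of CRLF as one separator, so the
;; blank line that ends the headers leaves no empty strings behind.
(define (header-lines raw)
  (string-split raw "\r\n" #:repeat? #t))

;; Hypothetical helper: fetch only the headers for a URL. It is not
;; called at module level, so requiring this file does no network I/O.
(define (fetch-header-lines url-string)
  (define port (get-impure-port (string->url url-string)))
  (define raw (purify-port port))
  (close-input-port port)
  (header-lines raw))
```

  Calling (for-each displayln (fetch-header-lines "...")) with the library
  URL should then print one header per line.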

  I futzed around with other URLs (google, reddit/programming) and was able to
  get roughly the same pages back from racket and wget.
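
  One thing I can't rule out: the response says Vary: User-Agent, so the
  server may choose its page based on that header, and wget sends one
  identifying itself. A sketch for testing that guess, assuming
  get-pure-port's optional header argument (a list of strings); ua-header
  and fetch-as are names I made up, and "Wget/1.16" is a placeholder, not
  necessarily what my wget sends:

```racket
#lang racket

(require net/url)

;; Hypothetical helper: build a User-Agent header line.
(define (ua-header agent)
  (string-append "User-Agent: " agent))

;; Hypothetical helper: fetch url-string while presenting the given
;; User-Agent, following up to 10 redirects. Nothing is fetched until
;; this function is called.
(define (fetch-as agent url-string)
  (call/input-url
    (string->url url-string)
    (lambda (u)
      (get-pure-port u (list (ua-header agent)) #:redirections 10))
    port->string))
```

  Diffing the result of (fetch-as "Wget/1.16" "...") against w.html would
  show whether that header makes any difference.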
