[R] string size limits in RCurl

2013-04-24 Thread Elmore, Ryan
Hi All,

I am running into what appears to be a character size limit in a JSON string when 
trying to retrieve data with either `curlPerform()` or `getURL()`.  Here is some 
non-reproducible code [1], but it should shed some light on the problem.

# Note that .base.url is the basic url for the API, q is a query, user
#  is specified, etc.
session = getCurlHandle()
curl.opts <- list(userpwd = paste(user, ":", key, sep = ""),
  httpheader = "Content-Type: application/json")
request <- paste(.base.url, q, sep = "")
txt <- getURL(url = request, curl = session, .opts = curl.opts,
  write = basicTextGatherer())

or

r = dynCurlReader()
curlPerform(url = request, writefunction = r$update, curl = session,
  .opts = curl.opts)

My guess is that the `update` or `value` functions in the `basicTextGatherer` or 
`dynCurlReader` text handler objects are having trouble with large strings. 
In this example, `r$value()` returns a truncated string of approximately 2 MB. 
The code given above works fine for queries that return less than 2 MB.
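
One way to check whether the truncation happens in the text handler or in the transfer itself is to count the bytes that actually reach the write callback. The sketch below is only an illustration in base R: `make_chunk_gatherer()` is a hypothetical helper, not part of RCurl, and the chunks are simulated rather than fetched from the API. Its `update` function returns the number of bytes handled, following libcurl's write-callback convention.

```r
# Hypothetical helper (not part of RCurl): stores every chunk it is given
# and can report the total number of bytes seen so far.
make_chunk_gatherer <- function() {
  chunks <- list()
  update <- function(str) {
    chunks[[length(chunks) + 1L]] <<- str   # append the chunk
    nchar(str, type = "bytes")              # bytes handled for this chunk
  }
  value <- function() paste(unlist(chunks), collapse = "")
  n_bytes <- function() sum(nchar(unlist(chunks), type = "bytes"))
  list(update = update, value = value, n_bytes = n_bytes)
}

# Simulated transfer: three chunks standing in for pieces of a response.
g <- make_chunk_gatherer()
for (piece in c('{"rows":[', '1,2,3', ']}')) g$update(piece)
body <- g$value()
```

With RCurl, `g$update` could presumably be passed as `writefunction` in the `curlPerform()` call above, and `g$n_bytes()` compared with the size the server reports. If the two agree, the handler is not the part that drops data.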

Note that I can easily do the following from the command line (or using 
`system()` in R), but writing to disc seems like a waste if I am doing the 
subsequent analysis in R.

curl -v --header "Content-Type: application/json" \
  --user username:register:passwd \
  https://base.url.for.api/getdata/select+*+from+sometable > stream.json

where `stream.json` is a roughly 14 MB JSON string. I can read the string into R 
using either

con <- file(paste(.project.path, "data/stream.json", sep = ""), "r")
string <- readLines(con)
close(con)

or directly to list as

tmp <- fromJSON(file = paste(.project.path, "data/stream.json", sep = ""))
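
A side note on the `readLines()` route: it returns a character vector with one element per line of the file, so a JSON document that spans several lines has to be collapsed before it is a single string. A minimal sketch, using a small temporary file as a stand-in for `stream.json` (the file contents here are made up for illustration):

```r
# Write a two-line JSON document to a temporary file (stand-in data).
path <- tempfile(fileext = ".json")
writeLines(c('{"id": 1,', ' "values": [1, 2, 3]}'), path)

json_lines <- readLines(path)                      # one element per line
json_string <- paste(json_lines, collapse = "\n")  # a single string
unlink(path)                                       # remove the temporary file
```

If the file holds the whole document on one line, `readLines()` already returns a length-one vector and the `paste()` step changes nothing.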

Any thoughts are very much appreciated.  Note that I posted this same 
question to StackOverflow and will happily relay any helpful 
suggestions there as well.

Ryan

[1] - Sorry for not providing reproducible code, but I'm dealing with a govt 
firewall.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RCurl and postForm()

2011-04-29 Thread Elmore, Ryan
Hi everybody,

I think that I am missing something fundamental in how strings are passed from 
a postForm() call in R to the curl or libcurl functions underneath.  For 
example, I can do the following using curl from the command line:

$ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]

Trying the same thing, or what I *think* is the same thing (obviously not), in R 
(Mac OS 10.6.7, R 2.13.0) produces:

> library(RCurl)
Loading required package: bitops
> api <- "http://www.datasciencetoolkit.org/text2people"
> postForm(api, a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
                charset 
"text/html"     "utf-8" 

I can match the result given on the DSTK API's website by using system(), but 
that doesn't seem like the R-like way of doing things.

> system("curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people'")
158   141  141   141
0[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]17599 72 --:--:-- --:--:-- --:--:--   670
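
A possible explanation for the difference: `curl -d` sends the string as the raw request body, whereas `postForm(api, a = "Archbishop Huxley")` sends a form with a field named `a`, so the server sees extra content around the name. The non-zero `start_index` values in the R output are consistent with that. The sketch below is an untested guess at a closer equivalent, passing the string through the `postfields` curl option. `post_raw_body()` is a hypothetical helper, it assumes the RCurl package is installed, and it has not been checked against the DSTK server.

```r
# Hypothetical helper: POST `body` as the raw request body and return the
# response text. Assumes RCurl is installed; it is only loaded when called.
post_raw_body <- function(url, body) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is required for this sketch")
  }
  reader <- RCurl::basicTextGatherer()
  RCurl::curlPerform(url = url,
                     postfields = body,            # raw body, like `curl -d`
                     writefunction = reader$update)
  reader$value()
}

# Usage (needs network access, so it is not run here):
# post_raw_body("http://www.datasciencetoolkit.org/text2people",
#               "Archbishop Huxley")
```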

If you want to see some additional information related to this question, I 
posted on StackOverflow a few days ago:
http://stackoverflow.com/questions/5797688/post-request-using-rcurl

I am working on this R wrapper for the Data Science Toolkit as a way of 
illustrating how to make an R package for the Denver RUG, and ran into this 
problem.  Any help with this problem will be greatly appreciated by the Denver 
RUG!

Cheers,
Ryan
