Appending to a list is only very slightly more efficient than incremental 
rbinding. Ideally you can figure out an upper bound for the number of records, 
preallocate a data frame of that size, modify each element in place as you go, 
and shrink the data frame once at the end as needed. If you cannot do that, you 
can fill fixed-size data frames, following the same strategy in chunks, and 
combine them with a single do.call/rbind at the end. 
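
A minimal sketch of that strategy, not tested against the original binary 
file. The names make_reader, collect_preallocated and collect_in_chunks, and 
the columns id and value, are hypothetical; make_reader only stands in for 
whatever produces one record at a time and returns NULL when the input is 
exhausted.

```r
# Hypothetical record source: yields n_available records, then NULL.
make_reader <- function(n_available) {
  i <- 0
  function() {
    if (i >= n_available) {
      NULL
    } else {
      i <<- i + 1
      c(id = i, value = i * 0.5)
    }
  }
}

# Preallocate max_records rows, fill them by index, shrink once at the end.
collect_preallocated <- function(read_record, max_records) {
  out <- data.frame(id = integer(max_records), value = numeric(max_records))
  n <- 0L
  while (n < max_records) {
    rec <- read_record()
    if (is.null(rec)) break
    n <- n + 1L
    out$id[n] <- as.integer(rec[["id"]])
    out$value[n] <- rec[["value"]]
  }
  # Keep only the rows that were actually filled.
  out[seq_len(n), , drop = FALSE]
}

# No known upper bound: fill fixed-size chunks, then rbind them once.
collect_in_chunks <- function(read_record, chunk_size) {
  chunks <- list()
  repeat {
    chunk <- collect_preallocated(read_record, chunk_size)
    if (nrow(chunk) == 0L) break
    chunks[[length(chunks) + 1L]] <- chunk
    # A short chunk means the source ran dry.
    if (nrow(chunk) < chunk_size) break
  }
  # Returns NULL if no records were read at all.
  do.call(rbind, chunks)
}

known_bound <- collect_preallocated(make_reader(7), max_records = 100)
unknown_bound <- collect_in_chunks(make_reader(25), chunk_size = 10)
```

In the chunked version rbind is called once on a short list of chunks rather 
than once per record, which is the point of the exercise.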

Note that reproducible examples including example data often yield working 
code, while incomplete examples tend to yield handwaving descriptions like the 
above. 

I will note that any code placed after a call to return() is never executed, 
so the close(URL) in your function never runs. I highly recommend avoiding the 
return function like the plague... use the 
expression-at-the-end-of-the-function method of returning.
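
A small sketch of what that can look like. Registering the cleanup with 
on.exit() is my own suggestion here, not something taken from the original 
function, and read_ints and the temporary file are hypothetical.

```r
# Hypothetical helper: read n little-endian 4-byte integers from a file.
read_ints <- function(path, n) {
  con <- file(path, "rb")
  # Cleanup registered here runs when the function exits, even on error,
  # so nothing needs to be placed after the value being returned.
  on.exit(close(con))
  # The last evaluated expression is the value of the function.
  readBin(con, integer(), n = n, size = 4, endian = "little")
}

# Usage: write five integers to a temporary file and read them back.
tmp <- tempfile(fileext = ".bin")
writeBin(1:5, tmp, size = 4, endian = "little")
values <- read_ints(tmp, 5)
unlink(tmp)
```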
-- 
Sent from my phone. Please excuse my brevity.

On September 17, 2016 7:10:05 AM PDT, Ismail SEZEN <sezenism...@gmail.com> 
wrote:
>I suspect that rbind is responsible. Use list and append instead of
>rbind. At the end, combine elements of list by do.call("rbind", list).
>
>> On 17 Sep 2016, at 15:05, Philippe de Rochambeau <phi...@free.fr> wrote:
>> 
>> Hello,
>> the following function, which stores numeric values extracted from a
>> binary file into an R matrix, is very slow, especially when the said
>> file is several MB in size.
>> Should I rewrite the function in inline C or in C/C++ using Rcpp? If
>> the latter case is true, how do you « readBin » in Rcpp (I’m a total
>> Rcpp newbie)?
>> Many thanks.
>> Best regards,
>> phiroc
>> 
>> 
>> -------------
>> 
>> # inputPath is something like http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
>> 
>> PLTreader <- function(inputPath){
>>      URL <- file(inputPath, "rb")
>>      PLT <- matrix(nrow=0, ncol=6)
>>      compteurDePrints = 0
>>      compteurDeLignes <- 0
>>      maxiPrints = 5
>>      displayData <- FALSE
>>      while (TRUE) {
>>              periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
>>              eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
>>              dword1 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
>>              dword2 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
>>              if (dword1 < 0) {
>>                      dword1 = dword1 + 2^32-1;
>>              }
>>              eventDate = (dword2*2^32 + dword1)/1000
>>              repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
>>              exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
>>              loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
>>              PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
>>      } # end while
>>      return(PLT)
>>      close(URL)
>> }
>> 
>> ----------------
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
