Re: [R] Removing polygons from shapefile of Scotland and Islands

2024-05-14 Thread Jan van der Laan
I believe mapshaper has functionality for removing small 'islands'. 
There is a web interface for mapshaper, but I see there is also an 
R package (see 
https://search.r-project.org/CRAN/refmans/rmapshaper/html/ms_filter_islands.html 
for island removal).
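
For example, something along these lines might work (untested; the 
min_area value is just a made-up placeholder and its units depend on the 
coordinate system of the data; `scotland` is the object from the question 
quoted below):

library(sf)
library(rmapshaper)
# drop all polygons smaller than some threshold area; tweak min_area to taste
scotland_main <- ms_filter_islands(scotland, min_area = 1e9)
plot(scotland_main$geometry)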


If you want to manually select which islands to keep and which to 
remove, you can split multipolygons into single polygons. I believe that 
is possible using st_cast.
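
A rough sketch of that manual route, assuming the mainland is simply the 
largest polygon:

polys <- st_cast(scotland, "POLYGON")           # one row per single polygon
mainland <- polys[which.max(st_area(polys)), ]  # keep the largest piece
plot(mainland$geometry)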


But if the goal is just to get the relevant portion of the map on screen: 
with the plot command and st_viewport it is possible to set the part of 
the map that is drawn.


HTH,
Jan


On 14-05-2024 15:16, Nick Wray wrote:

Hello  I have a shapefile of Scotland, including the islands.  The river
flow data I am using is only for the mainland and for a clearer and larger
map I would like to not plot Orkney and Shetland to the north of the
mainland, as I don't need them.

The map I have I got from
https://borders.ukdataservice.ac.uk/easy_download_data.html?data=infuse_ctry_2011

then I put the uk shapefile onto my laptop with no problems (I have sf
running)

the_uk<-st_read(dsn="C:/Users/nickm/Desktop/Shapefiles/infuse_ctry_2011.shp")

scotland<-the_uk[2,]

plot(scotland$geometry)

This gives me a nice map of Scotland  plus islands but obviously there are
lots of separate polygons and if I go into the points with something like

scot_pts<-unlist(as.data.frame(scotland$geometry))

it's not at all clear how I can get rid of the points I don't want as they
don't seem to be listed in any easy way to find where one polygon stops and
another starts

I am wondering whether this approach is right anyway or whether there is
some sf function which would allow me to identify the polygons I want -
essentially the big one which is the mainland without lots of elaborate
conversions and manipulations

Any pointers, thoughts etc much appreciated

Thanks Nick Wray


Re: [R] Inquiry about bandwidth rescaling in Ksmooth

2023-10-27 Thread Jan Failenschmid via R-help
Dear Bert,

Thank you very much for your quick reply.
I have tested this, and it indeed appears to be the source of the discrepancy I 
observed.
My apologies for overlooking this in the documentation and thank you for 
clarifying.

Cheers,

Jan

From: Bert Gunter 
Sent: Thursday, 26 October 2023 20:19
To: Jan Failenschmid 
Cc: r-help@r-project.org 
Subject: Re: [R] Inquiry about bandwidth rescaling in Ksmooth

Apologies in advance if my comments don't help, in which case no need
to respond, but I noted in ?ksmooth:

"bandwidth
the bandwidth. The kernels are scaled so that their quartiles (viewed
as probability densities) are at ± 0.25*bandwidth." So, could this be
a source of the discrepancies you cited?
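
For what it is worth, a small sketch of the rescaling this implies for
kernel = "normal", assuming the other estimators (e.g. KernSmooth::locpoly)
take the kernel's standard deviation h as their bandwidth:

# the quartile of a normal density sits at qnorm(0.75) standard deviations,
# so matching ksmooth's convention means bandwidth = 4 * qnorm(0.75) * h
h  <- 0.5                      # hypothetical bandwidth on the sd scale
bw <- 4 * qnorm(0.75) * h      # approximately 2.70 * h
x  <- sort(runif(200))
y  <- sin(2 * pi * x) + rnorm(200, sd = 0.2)
fit <- ksmooth(x, y, kernel = "normal", bandwidth = bw, x.points = x)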

Given that ?ksmooth explicitly says:

"Note:
This function was implemented for compatibility with S, although it is
nowhere near as slow as the S function. Better kernel smoothers are
available in other packages such as KernSmooth."

One wonders why you bother with it at all. (That was rhetorical -- do
not answer.)

Cheers,
Bert

On Thu, Oct 26, 2023 at 11:06 AM Jan Failenschmid via R-help
 wrote:
>
> Dear Sir, Madam, or to whom this may concern,
>
> my name is Jan Failenschmid and I am a Ph.D. student at Tilburg University.
> For my project I have been looking into different types of kernel regression 
> estimators and corresponding R functions.
> While comparing different functions I noticed that stats::ksmooth returned 
> different estimates for the same bandwidth
> as other kernel regression estimators that should be equivalent (i.e. the 
> local polynomial estimators KernSmooth::locpoly and
> locpol::locpol with degree 0). However, when optimizing the bandwidth of 
> ksmooth separately using the same loss function, I find comparable estimates 
> to the other two estimators for a (larger) different bandwidth. To confirm 
> this, I wrote my own Nadaraya-Watson kernel regression estimator, which is 
> consistent with the two local polynomial estimators and shows the same 
> discordance with ksmooth.
>
> This led me to the suspicion that the bandwidth that is passed to ksmooth is 
> rescaled or transformed within the function. Unfortunately, I was not able to 
> confirm this with either the code of the function or the documentation. It 
> would be of great help to me if you could clarify this for me.
>
> Thank you very much for your time and help in advance.
>
> Kind regards,
>
> Jan Failenschmid
>


[R] Inquiry about bandwidth rescaling in Ksmooth

2023-10-26 Thread Jan Failenschmid via R-help
Dear Sir, Madam, or to whom this may concern,

my name is Jan Failenschmid and I am a Ph.D. student at Tilburg University.
For my project I have been looking into different types of kernel regression 
estimators and corresponding R functions.
While comparing different functions I noticed that stats::ksmooth returned 
different estimates for the same bandwidth
as other kernel regression estimators that should be equivalent (i.e. the local 
polynomial estimators KernSmooth::locpoly and
locpol::locpol with degree 0). However, when optimizing the bandwidth of 
ksmooth separately using the same loss function, I find comparable estimates to 
the other two estimators for a (larger) different bandwidth. To confirm this, I 
wrote my own Nadaraya-Watson kernel regression estimator, which is consistent 
with the two local polynomial estimators and shows the same discordance with 
ksmooth.

This led me to the suspicion that the bandwidth that is passed to ksmooth is 
rescaled or transformed within the function. Unfortunately, I was not able to 
confirm this with either the code of the function or the documentation. It 
would be of great help to me if you could clarify this for me.

Thank you very much for your time and help in advance.

Kind regards,

Jan Failenschmid



Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?

2023-10-06 Thread Jan van der Laan
Another thing that I considered, but doesn't seem to be supported, is 
rotating the symbols. I noticed that that does work with text. So you 
could use an arrow symbol and then specify the angle aesthetic. But this 
still relies on text, and unfortunately there are no arrow-like symbols in 
ASCII, except perhaps 'V'.
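
A rough sketch of that rotation idea (untested; how well it renders will 
depend on the font and device): geom_text takes an angle argument, so an 
upward-pointing ASCII character can be flipped to point down.

library(ggplot2)
ggplot(data.frame(x = 1:3, y = 1:3), aes(x, y)) +
  geom_text(label = "^", angle = 180, size = 10)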


I can't say how good the support for non-ASCII text is across different 
OSes and locales. 
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues 
gives some hints.





On 06-10-2023 14:21, Chris Evans via R-help wrote:
Thanks again Jan.  That is lovely and clean and I probably should have 
seen that option.


I had anxieties about the portability of using text.  (The function will 
end up in my https://github.com/cpsyctc/CECPfuns package so I'd like it 
to be fairly immune to character sets and different platforms in 
different countries.)

I'm morphing this question a lot now but I guess it's still on topic 
really.  I know
I need to put in some time to understand the complexities of R and 
platforms (I'm
pretty exclusively on Linux, Ubuntu or Debian now so have mostly done 
the ostrich thing
about these issues though I do hit problems exchanging things with my 
Spanish speaking
colleagues).  Jan or anyone: any simple reassurance or pointers to 
resources I should

best use for homework about these issues?

TIA (again!)

Chris

On 06/10/2023 12:55, Jan van der Laan wrote:

You are right, sorry.

Another possible solution then: use geom_text instead of geom_point 
and use a triangle shape as text:


ggplot(data = tmpTibPoints,
       aes(x = x, y = y)) +
  geom_polygon(data = tmpTibAreas,
               aes(x = x, y = y, fill = a)) +
  geom_text(data = tmpTibPoints,
            aes(x = x, y = y, label = "▼", color = c),
            size = 6) +
  guides(color = FALSE)


[much snipped]






Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?

2023-10-06 Thread Jan van der Laan

You are right, sorry.

Another possible solution then: use geom_text instead of geom_point and 
use a triangle shape as text:


ggplot(data = tmpTibPoints,
       aes(x = x, y = y)) +
  geom_polygon(data = tmpTibAreas,
               aes(x = x, y = y, fill = a)) +
  geom_text(data = tmpTibPoints,
            aes(x = x, y = y, label = "▼", color = c),
            size = 6) +
  guides(color = FALSE)



On 06-10-2023 12:11, Chris Evans via R-help wrote:
Sadly, no.  Still shows the same legend with both sets of fill 
mappings.  I have found a workaround, sadly
much longer than yours (!) that does get me what I want but it is a real 
bodge.  Still interested to see
if there is a way to create a downward pointing solid symbol but here is 
my bodge using new_scale_fill()
and new_scale_color() from the ggnewscale package (many thanks to Elio 
Campitelli for that).


library(tidyverse)
library(ggnewscale) # allows me to change the scales used
tibble(x = 2:9, y = 2:9,
       ### I have used A:C to ensure the changes sort in the correct order
       ### to avoid the messes of using shape to scale an ordinal variable
       ### have to say that seems a case where it is perfectly sensible
       ### to map shapes to an ordinal variable, scale_shape_manual() makes
       ### this difficult hence this bodge
       c = c(rep("A", 5), "B", rep("C", 2)),
       change = c(rep("Deteriorated", 5), "No change", rep("Improved", 2))) %>%
  ### this is just keeping the original coding but not used below
  mutate(change = ordered(change,
                          levels = c("Deteriorated", "No change", "Improved"))) -> tmpTibPoints

### create the area mapping
tibble(x = c(1, 5, 5, 1), y = c(1, 1, 5, 5), a = rep("a", 4)) -> tmpTibArea1
tibble(x = c(5, 10, 10, 5), y = c(1, 1, 5, 5), a = rep("b", 4)) -> tmpTibArea2
tibble(x = c(1, 5, 5, 1), y = c(5, 5, 10, 10), a = rep("c", 4)) -> tmpTibArea3
tibble(x = c(5, 10, 10, 5), y = c(5, 5, 10, 10), a = rep("d", 4)) -> tmpTibArea4

bind_rows(tmpTibArea1,
          tmpTibArea2,
          tmpTibArea3,
          tmpTibArea4) -> tmpTibAreas
### now plot
ggplot(data = tmpTibPoints,
       aes(x = x, y = y)) +
  geom_polygon(data = tmpTibAreas,
               aes(x = x, y = y, fill = a),
               alpha = .5) +
  scale_fill_manual(name = "Areas",
                    values = c("orange", "purple", "yellow", "brown"),
                    labels = letters[1:4]) +
  ### next two lines use ggnewscale functions to reset the scale mappings
  new_scale_fill() +
  new_scale_colour() +
  ### can now use the open triangles and fill aesthetic to map them
  geom_point(data = tmpTibPoints,
             aes(x = x, y = y, shape = c, fill = c, colour = c),
             size = 6) +
  ### use the ordered variable c to get mapping in desired order
  ### which, sadly, isn't the alphabetical order!
  scale_shape_manual(name = "Change",
                     values = c("A" = 24,
                                "B" = 23,
                                "C" = 25),
                     labels = c("Deteriorated",
                                "No change",
                                "Improved")) +
  scale_colour_manual(name = "Change",
                      values = c("A" = "red",
                                 "B" = "grey",
                                 "C" = "green"),
                      labels = c("Deteriorated",
                                 "No change",
                                 "Improved")) +
  scale_fill_manual(name = "Change",
                    values = c("A" = "red",
                               "B" = "grey",
                               "C" = "green"),
                    labels = c("Deteriorated",
                               "No change",
                               "Improved"))

That gives the attached plot which is really what I want.  Long bodge though!

On 06/10/2023 11:50, Jan van der Laan wrote:


Does adding

, show.legend = c("color"=TRUE, "fill"=FALSE)

to the geom_point do what you want?

Best,
Jan

On 06-10-2023 11:09, Chris Evans via R-help wrote:

library(tidyverse)
tibble(x = 2:9, y = 2:9, c = c(rep("A", 5), rep("B", 3))) -> 
tmpTibPoints
tibble(x = c(1, 5, 5, 1), y = c(1, 1, 5, 5), a = rep("a", 4)) -> 
tmpTibArea1
tibble(x = c(5, 10, 10, 5), y = c(1, 1, 5, 5), a = rep("b", 4)) -> 
tmpTibArea2
tibble(x = c(1, 5, 5, 1), y = c(5, 5, 10, 10), a = rep("c", 4)) -> 
tmpTibArea3
tibble(x = c(5, 10, 10, 5), y = c(

Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?

2023-10-06 Thread Jan van der Laan



Does adding

, show.legend = c("color"=TRUE, "fill"=FALSE)

to the geom_point do what you want?

Best,
Jan

On 06-10-2023 11:09, Chris Evans via R-help wrote:

library(tidyverse)
tibble(x = 2:9, y = 2:9, c = c(rep("A", 5), rep("B", 3))) -> tmpTibPoints
tibble(x = c(1, 5, 5, 1), y = c(1, 1, 5, 5), a = rep("a", 4)) -> tmpTibArea1
tibble(x = c(5, 10, 10, 5), y = c(1, 1, 5, 5), a = rep("b", 4)) -> tmpTibArea2
tibble(x = c(1, 5, 5, 1), y = c(5, 5, 10, 10), a = rep("c", 4)) -> tmpTibArea3
tibble(x = c(5, 10, 10, 5), y = c(5, 5, 10, 10), a = rep("d", 4)) -> tmpTibArea4

bind_rows(tmpTibArea1,
          tmpTibArea2,
          tmpTibArea3,
          tmpTibArea4) -> tmpTibAreas
ggplot(data = tmpTibPoints,
       aes(x = x, y = y)) +
  geom_polygon(data = tmpTibAreas,
               aes(x = x, y = y, fill = a)) +
  geom_point(data = tmpTibPoints,
             aes(x = x, y = y, fill = c),
             pch = 24,
             size = 6)




Re: [R] overlay shaded area in base r plot

2023-09-19 Thread Jan van der Laan

Shorter/simpler alternative for adding an alpha channel:

adjustcolor("lightblue", alpha = 0.5)


So I would use something like:


# Open new plot; make sure limits are ok; but don't plot
plot(0, 0, xlim=c(1,20),
  ylim = range(c(mean1+sd1, mean2+sd2, mean1-sd1, mean2-sd2)),
  type="n", las=1,
  xlab="Data",
  ylab=expression(bold("Val")),
  cex.axis=1.2,font=2,
  cex.lab=1.2)
polygon(c(1:20,20:1),
  c(mean1[1:20]+sd1[1:20],mean1[20:1]-sd1[20:1]),
  col=adjustcolor("blue", 0.5),
  border = NA)
polygon(c(1:20,20:1),
  c(mean2[1:20]+sd2[1:20],mean2[20:1]-sd2[20:1]),
  col=adjustcolor("yellow", 0.5),
  border = NA)
lines(1:20, mean1,lty=1,lwd=2,col="blue")
lines(1:20, mean2,lty=1,lwd=2,col="yellow")
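
The snippet above assumes mean1, sd1, mean2 and sd2 already exist; a 
hypothetical setup to try it out (not from the original post) could be:

set.seed(1)
mean1 <- cumsum(rnorm(20)); sd1 <- runif(20, 0.5, 1)   # 20 means and sds
mean2 <- cumsum(rnorm(20)); sd2 <- runif(20, 0.5, 1)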


On 19-09-2023 09:16, Ivan Krylov wrote:

On Tue, 19 Sep 2023 13:21:08 +0900, ani jaya wrote:


polygon(c(1:20,20:1),c(mean1[1:20]+sd1[1:20],mean1[20:1]),col="lightblue")
polygon(c(1:20,20:1),c(mean1[1:20]-sd1[1:20],mean1[20:1]),col="lightblue")
polygon(c(1:20,20:1),c(mean2[1:20]+sd2[1:20],mean2[20:1]),col="lightyellow")
polygon(c(1:20,20:1),c(mean2[1:20]-sd2[1:20],mean2[20:1]),col="lightyellow")


If you want the areas to overlap, try using a transparent colour. For
example, "lightblue" is rgb(t(col2rgb("lightblue")), max = 255) →
"#ADD8E6", so try setting the alpha (opacity) channel to something less
than FF, e.g., "#ADD8E688".

You can also use rgb(t(col2rgb("lightblue")), alpha = 128, max = 255)
to generate hexadecimal colour strings for a given colour name and
opacity value.





Re: [R] Obtaining R-squared from All Possible Combinations of Linear Models Fitted

2023-07-18 Thread Jan van der Laan

The dredge function has an `extra` argument to get other statistics:

optional additional statistics to be included in the result, provided as 
functions, function names or a list of such (preferably named or 
quoted). As with the rank argument, each function must accept as an 
argument a fitted model object and return (a value coercible to) a 
numeric vector. This could be, for instance, additional information 
criteria or goodness-of-fit statistics. The character strings "R^2" and 
"adjR^2" are treated in a special way and add a likelihood-ratio based 
R² and modified-R² to the result, respectively (this is more efficient 
than using r.squaredLR directly).
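
So, reusing the code from your mail, something like this should work (a 
sketch based on the documentation quoted above):

library(MuMIn)
options(na.action = "na.fail")
globalmodel  <- lm(y ~ ., data = final_frame)
combinations <- dredge(globalmodel, extra = c("R^2", "adjR^2"))
# the R^2 and adjR^2 columns are then part of the model selection table
head(as.data.frame(combinations))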


HTH
Jan



On 17-07-2023 19:24, Paul Bernal wrote:

Dear friends,

I need to automatically fit all possible linear regression models (with all
possible combinations of regressors), and found the MuMIn package, which
has the dredge function.

This is the dataset  I am working with:

dput(final_frame)

structure(list(y = c(41.9, 44.5, 43.9, 30.9, 27.9, 38.9, 30.9,
28.9, 25.9, 31, 29.5, 35.9, 37.5, 37.9), x1 = c(6.6969, 8.7951,
9.0384, 5.9592, 4.5429, 8.3607, 5.898, 5.6039, 4.9176, 6.2712,
5.0208, 5.8282, 5.9894, 7.5422), x4 = c(1.488, 1.82, 1.5, 1.121,
1.175, 1.777, 1.24, 1.501, 0.998, 0.975, 1.5, 1.225, 1.256, 1.69
), x8 = c(22, 50, 23, 32, 40, 48, 51, 32, 42, 30, 62, 32, 40,
22), x2 = c(1.5, 1.5, 1, 1, 1, 1.5, 1, 1, 1, 1, 1, 1, 1, 1.5),
 x7 = c(3, 4, 3, 3, 3, 4, 3, 3, 4, 2, 4, 3, 3, 3)), class =
"data.frame", row.names = c(NA,
-14L))

I started with the all regressor model, which I called globalmodel as
follows:
#Fitting Regression model with all possible combinations of regressors
options(na.action = "na.fail") # change the default "na.omit" to prevent
models
globalmodel <- lm(y~., data=final_frame)

Then, the following code provides the different coefficients (for
regressors and the intercept) for each of the possible model combinations:
combinations <- dredge(globalmodel)
print(combinations)
  I would like to retrieve  the R-squared generated by each combination, but
have not been able to get it thus far.

Any guidance on how to retrieve the R-squared from all linear model
combinations would be greatly appreciated.

Kind regards,
Paul



Re: [R] Plotting directly to memory?

2023-05-28 Thread Jan van der Laan



Perhaps the ragg package? That has an `agg_capture` device "that lets 
you access the device buffer directly from your R session." 
https://github.com/r-lib/ragg
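
A minimal sketch of how that might look (assuming ragg is installed; 
untested here):

library(ragg)
cap <- agg_capture(width = 400, height = 400, res = 72)
plot(1:10, 1:10)
img <- cap()   # matrix of colour values holding the current plot
dev.off()
dim(img)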


HTH,
Jan




On 28-05-2023 13:46, Duncan Murdoch wrote:
Is there a way to open a graphics device that plots entirely to an array 
or raster in memory?  I'd prefer it to use base graphics, but grid would 
be fine if it makes a difference.


For an explicit example, I'd like to do the equivalent of this:

   filename <- tempfile(fileext = ".png")
   png(filename)
   plot(1:10, 1:10)
   dev.off()

   library(png)
   img <- readPNG(filename)

   unlink(filename)


which puts the desired plot into the array `img`, but I'd like to do it 
without needing the `png` package or the temporary file.


A possibly slightly simpler request would be to do this only for 
plotting text, i.e. I'd like to rasterize some text into an array.


Duncan Murdoch



Re: [R] nth kludge

2023-03-09 Thread Jan van der Laan

Hi Avi, list,

Below an alternative suggestion:

func <- function(a, b, c) {
  list(a, b, c)
}

1:3 |> list(x = _) |> with(func(a, x, b))


Not sure if this is more readable than some of the other solutions, e.g. 
your solution, but you could make a variant of with more specific for 
this use case:


named <- function(expr, ...) {
  eval(substitute(expr), list(...), enclos = parent.frame())
}

then you can do:

1:3 |> named(func(1, x, mean(x)), x= _)

or perhaps you can even simplify further using the same strategy:


dot <- function(.,  expr) {
  eval(substitute(expr), list(. = .), enclos = parent.frame())
}

1:3 |> dot(func(1, ., mean(.)))

This seems simpler than the lambda notation and more general than your 
solution. Not sure if this has any drawbacks.


HTH,
Jan



On 08-03-2023 21:23, avi.e.gr...@gmail.com wrote:

I see many are not thrilled with the concise but unintuitive way it is
suggested you use with the new R pipe function.
  
I am wondering if any has created one of a family of functions that might be

more intuitive if less general.
  
Some existing pipes simply allowed you to specify where in an argument list

to put the results from the earlier pipeline as in:
  
. %>% func(first, . , last)
  
In the above common case, it substituted into the second position.
  
What would perhaps be a useful variant is a function that does not evaluate
its arguments and expects a first argument passed from the pipe, a
second argument that is a number like 2 or 3, and a third argument that is
the (name of) a function, plus remaining arguments.
  
The above might look like:
  
. %>% the_nth(2, func, first , last)
  
The above asks to take the new implicitly passed first argument which I will

illustrate with a real argument as it would also work without a pipeline:
  
the_nth(piped, 2, func, first, last)
  
So it would make a list out of the remaining arguments that looks like

list(first, last) and interpolate piped at position 2 to make list(first,
piped, last) and then use something like do.call()
  
do.call(func, list(first, piped, last))
  
I am not sure if this is much more readable, but seems like a

straightforward function to write, and perhaps a decent version could make
it into the standard library some year that is general and more useful than
the darn anonymous lambda notation.
  




Re: [R] foreign package: unable to read S-Plus objects

2023-01-17 Thread Jan van der Laan
You could try to see what Stat/Transfer can make of it. They have a free 
version that imports only part of the data. You could use that to see if 
Stat/Transfer would help and perhaps discover what format the files are in.


HTH
Jan


On 16-01-2023 23:22, Joseph Voelkel wrote:

Dear foreign maintainers and others,

I am trying to import a number of S-Plus objects into R. The only way I see how 
to do this is by using the foreign package.

However, when I try to do this I receive an error message. A snippet of code 
and the error message follows:

read.S(file.path(Spath, "nrand"))
Error in read.S(file.path(Spath, "nrand")) : not an S object

I no longer know the version of S-Plus in which these objects were created. I 
do know that I have printed documentation, dated July 2001, from S-Plus 6; and 
that all S-Plus objects were created in the 9/2004 -- 5/2005 range.

I am afraid that I simply have S-Plus objects that are not the S version 3 
files that the foreign package can read, yes? But I am still hoping that it may 
be possible to read these in.

I am not attaching some sample S-Plus objects to this email, because I  believe 
they will be stripped away as binary files. However, a sample of these files 
may be found at

https://drive.google.com/drive/folders/1wFVa972ciP44Ob2YVWfqk8SGIodzAXPv?usp=sharing
  (simdat is the largest file, at 469 KB)

Thank you for any assistance you may provide.

R 4.2.2
Microsoft Windows [Version 10.0.22000.1455]
foreign_0.8-83


Joe Voelkel
Professor Emeritus
RIT



Re: [R] Reading very large text files into R

2022-09-29 Thread Jan van der Laan
Are you sure the extra column is indeed an extra column? According to the 
documentation 
(https://artefacts.ceda.ac.uk/badc_datadocs/ukmo-midas/RH_Table.html) 
there should be 15 columns.


Could it, for example, be that one of the columns contains records with 
commas?
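
One quick way to check is to count the fields per line; "midas.txt" below 
is just a placeholder for your file name:

fields <- count.fields("midas.txt", sep = ",", quote = "\"")
table(fields)               # how many lines have 15 vs 16 fields
head(which(fields == 16))   # the first offending line numbers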


Jan



On 29-09-2022 15:54, Nick Wray wrote:

Hello   I may be offending the R purists with this question but it is
linked to R, as will become clear.  I have very large data sets from the UK
Met Office in notepad form.  Unfortunately,  I can’t read them directly
into R because, for some reason, although most lines in the text doc
consist of 15 elements, every so often there is a sixteenth one and R
doesn’t like this and gives me an error message because it has assumed that
every line has 15 elements and doesn’t like finding one with more.  I have
tried playing around with the text document, inserting an extra element
into the top line etc, but to no avail.

Also unfortunately you need access permission from the Met Office to get
the files in question so this link probably won’t work:

https://catalogue.ceda.ac.uk/uuid/bbd6916225e7475514e17fdbf11141c1

So what I have done is simply to copy and paste the text docs into excel
csv and then read them in, which is time-consuming but works.  However the
later datasets are over the excel limit of 1048576 lines.  I can paste in
the first 1048576 lines but then trying to isolate the remainder of the
text doc to paste it into a second csv doc is proving v difficult – the
only way I have found is to scroll down by hand and that’s taking ages.  I
cannot find another way of editing the notepad text doc to get rid of the
part which I have already copied and pasted.

Can anyone help with a)ideally being able to simply read the text tables
into R  or b)suggest a way of editing out the bits of the text file I have
already pasted in without laborious scrolling?

Thanks Nick Wray



Re: [R] How to represent tree-structured values

2022-05-30 Thread Jan van der Laan
For visualising hierarchical data a treemap can also work well. For 
example, using the treemap package:


n <- 1000

library(data.table)
library(treemap)

dta <- data.table(
  level1 = sample(LETTERS[1:5], n, replace = TRUE),
  level2 = sample(letters[1:5], n, replace = TRUE),
  level3 = sample(1:9, n, replace = TRUE),
  event = sample(0:1, n, replace = TRUE)
  )

tab <- dta[, .(n = .N, rate = sum(event)/.N),
  by = .(level1, level2, level3)]

treemap(tab, index = names(tab)[1:3], vSize = "n", vColor = "rate",
  type = "value", fontsize.labels = 20*c(1, 0.7, 0))


--

Jan




On 30-05-2022 11:40, Jim Lemon wrote:

Hi Richard,
Thinking about this, you might also find intersectDiagram, also in
plotrix, to be useful.

Jim

On Mon, May 30, 2022 at 4:37 PM Jim Lemon  wrote:

Hi Richard,
Some years ago I had a try at illustrating Multiple Causes of Death
(MCoD) data. I settled on what is sometimes called a "sizetree". You
can see some examples in the sizetree function help page in "plotrix".
Unfortunately I can't use the original data as it was confidential.

Jim

On Mon, May 30, 2022 at 2:55 PM Richard O'Keefe  wrote:

There is a kind of data I run into fairly often
which I have never known how to represent in R,
and nothing I've tried really satisfies me.

Consider for example
  ...
  - injuries
...
- injuries to limbs
  ...
  - injuries to extremities
...
- injuries to hands
  - injuries to dominant hand
  - injuries to non-dominant hand
...
  ...
...

This isn't ordinal data, because there is no
"left to right" order on the values.  But there
IS a "part/whole" order, which an analysis should
respect, so it's not pure nominal data either.

As one particular example, if I want to
tabulate data like this, an occurrence of one
value should be counted as an occurrence of
*every* superordinate value.

Examples of such data include "why is this patient
being treated", "what drug is this patient being
treated with", "what geographic region is this
school from", "what biological group does this
insect belong to".

So what is the recommended way to represent
and the recommended way to analyse such data in R?



[R] splitting data matrix into submatrices

2022-01-04 Thread Faheem Jan via R-help
I have a data matrix of order 1826*24, where 1826 represents the days and 24 the 
hourly observations on each day. My objective is to split the matrix 
into working-day (Monday to Friday) and non-working-day (Saturday and Sunday) 
submatrices. Can anyone help me with how to do that splitting using R?




Re: [R] vectorization of loops in R

2021-11-17 Thread Jan van der Laan

Have a look at the base functions tapply and aggregate.

For example see:
- 
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#The-function-tapply_0028_0029-and-ragged-arrays 
,

- https://online.stat.psu.edu/stat484/lesson/9/9.2,
- or ?tapply and ?aggregate.
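
For example, the loop in your mail could be replaced by something like 
this (a sketch using the df from your example):

m <- tapply(df$z, df$y, mean)            # named vector of group means
# or, returning a data.frame:
aggregate(z ~ y, data = df, FUN = mean)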

Also your current code seems to contain an error: `s = df[df$y == i,]` 
should be `s = df$z[df$y == i]` I think.


HTH,
Jan






On 17-11-2021 14:20, Luigi Marongiu wrote:

Hello,
I have a dataframe with 3 variables. I want to loop through it to get
the mean value of the variable `z`, as follows:
```
df = data.frame(x = c(rep(1,5), rep(2,5), rep(3,5)),
y = rep(letters[1:5],3),
z = rnorm(15),
stringsAsFactors = FALSE)
m = vector()
for (i in unique(df$y)) {
s = df[df$y == i,]
m = append(m, mean(s$z))
}
names(m) = unique(df$y)

(m)

a  b  c  d  e
-0.6355382 -0.4218053 -0.7256680 -0.8320783 -0.2587004
```
The problem is that I have one million `y` values, so the work takes
almost a day. I understand that vectorization will speed up the
procedure. But how shall I write the procedure in vectorial terms?
Thank you



Re: [R] Is there a hash data structure for R

2021-11-03 Thread Jan van der Laan




On 03-11-2021 00:42, Avi Gross via R-help wrote:



Finally, someone mentioned how creating a data.frame with duplicate names
for columns is not a problem as it can automagically CHANGE them to be
unique. That is a HUGE problem for using that as a dictionary as the new
name will not be known to the system so all kinds of things will fail.


I think you are referring to my remark which was:

> However, the data.frame construction method will detect this and
> generate unique names (which also might not be what you want):

I didn't say this means that duplicate names aren't a problem; I just 
mentioned that the behaviour is different. Personally, I would actually 
prefer the behaviour of list (keep the duplicated name) with a warning.


Most of the responses seem to assume that the OP actually wants a hash 
table. Yes, he did ask for that, and for a hash table an environment 
(with some work) would be a good option. But in many cases where other 
languages would use a hash-table-like object (such as a dict), in R you 
would use other types of objects. Furthermore, for many operations where 
you might use hash tables, R already has built-in options, for example 
%in%, match and duplicated. These are also vectorised, so two vectors 
(one with keys and one with values) might actually be faster than an 
environment in some use cases.
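
A small sketch of that "two vectors" style, with made-up data:

prices <- c(apple = 1.2, pear = 0.8, plum = 0.5)  # keys as names, values as elements
prices[["pear"]]                                  # single lookup
prices[c("plum", "apple")]                        # vectorised lookup
match(c("plum", "grape"), names(prices))          # positions; NA when a key is absent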


Best,
Jan




And there are also packages for many features like sets as well as functions
to manipulate these things.

-Original Message-
From: R-help  On Behalf Of Bill Dunlap
Sent: Tuesday, November 2, 2021 1:26 PM
To: Andrew Simmons 
Cc: R Help 
Subject: Re: [R] Is there a hash data structure for R

Note that an environment carries a hash table with it, while a named list
does not.  I think that looking up an entry in a list causes a hash table to
be created and thrown away.  Here are some timings involving setting and
getting various numbers of entries in environments and lists.  The times are
roughly linear in n for environments and quadratic for lists.


vapply(1e3 * 2 ^ (0:6), f, L=new.env(parent=emptyenv()), FUN.VALUE=NA_real_)
[1] 0.00 0.00 0.00 0.02 0.03 0.06 0.15

vapply(1e3 * 2 ^ (0:6), f, L=list(), FUN.VALUE=NA_real_)
[1]  0.01  0.03  0.15  0.53  2.66 13.66 56.05

f
function(n, L, V = sprintf("V%07d", sample(n, replace=TRUE))) {
  system.time(for(v in V) L[[v]] <- c(L[[v]], v))["elapsed"]
}

Note that environments do not allow an element named "" (the empty string).

Elements named NA_character_ are treated differently in environments and
lists, neither of which is great.  You may want your hash table functions to
deal with oddball names explicitly.

-Bill

On Tue, Nov 2, 2021 at 8:52 AM Andrew Simmons  wrote:


If you're thinking about using environments, I would suggest you
initialize them like


x <- new.env(parent = emptyenv())


Since environments have parent environments, it means that requesting
a value from that environment can actually return the value stored in
a parent environment (this isn't an issue for [[ or $, this is
exclusively an issue with assign, get, and exists) Or, if you've
already got your values stored in a list that you want to turn into an
environment:


x <- list2env(listOfValues, parent = emptyenv())


Hope this helps!


On Tue, Nov 2, 2021, 06:49 Yonghua Peng  wrote:


But for data.frame the colnames can be duplicated. Am I right?

Regards.

On Tue, Nov 2, 2021 at 6:29 PM Jan van der Laan 

wrote:




True, but in a lot of cases where a python user might use a dict
an R user will probably use a list; or when we are talking about
arrays of dicts in python, the R solution will probably be a
data.frame (with

each

dict field in a separate column).

Jan




On 02-11-2021 11:18, Eric Berger wrote:

One choice is
new.env(hash=TRUE)
in the base package



On Tue, Nov 2, 2021 at 11:48 AM Yonghua Peng  wrote:


I know this is a newbie question. But how do I implement the hash structure
which is available in other languages (in python it's dict)?

I know there is the list, but list's names can be duplicated here.

x <- list(x=1:5, y=month.name, x=3:7)
x

$x
[1] 1 2 3 4 5

$y
 [1] "January"   "February"  "March"     "April"     "May"       "June"
 [7] "July"      "August"    "September" "October"   "November"  "December"

$x
[1] 3 4 5 6 7



Thanks a lot.


Re: [R] Is there a hash data structure for R

2021-11-02 Thread Jan van der Laan



Yes. A data.frame is basically a list where all elements are vectors of 
the same length. So this issue also exists in a data.frame. However, the 
data.frame construction method will detect this and generate unique 
names (which also might not be what you want):


> data.frame(a=1:3, a=1:3)
  a a.1
1 1   1
2 2   2
3 3   3

But with a little effort you can still create a data.frame with 
multiple columns with the same name. As Duncan Murdoch mentions, you 
can usually control for that.



Best,
Jan




On 02-11-2021 11:32, Yonghua Peng wrote:

But for data.frame the colnames can be duplicated. Am I right?

Regards.

On Tue, Nov 2, 2021 at 6:29 PM Jan van der Laan wrote:



True, but in a lot of cases where a python user might use a dict an R
user will probably use a list; or when we are talking about arrays of
dicts in python, the R solution will probably be a data.frame (with
each
dict field in a separate column).

Jan




On 02-11-2021 11:18, Eric Berger wrote:
 > One choice is
 > new.env(hash=TRUE)
 > in the base package
 >
 >
 >
 > On Tue, Nov 2, 2021 at 11:48 AM Yonghua Peng wrote:
 >
 >> I know this is a newbie question. But how do I implement the hash structure
 >> which is available in other languages (in python it's dict)?
 >>
 >> I know there is the list, but list's names can be duplicated here.
 >>
 >>> x <- list(x=1:5, y=month.name, x=3:7)
 >>> x
 >> $x
 >> [1] 1 2 3 4 5
 >>
 >> $y
 >>  [1] "January"   "February"  "March"     "April"     "May"       "June"
 >>  [7] "July"      "August"    "September" "October"   "November"  "December"
 >>
 >> $x
 >> [1] 3 4 5 6 7
 >>
 >>
 >>
 >> Thanks a lot.
 >>


Re: [R] Is there a hash data structure for R

2021-11-02 Thread Jan van der Laan



True, but in a lot of cases where a python user might use a dict an R 
user will probably use a list; or when we are talking about arrays of 
dicts in python, the R solution will probably be a data.frame (with each 
dict field in a separate column).


Jan




On 02-11-2021 11:18, Eric Berger wrote:

One choice is
new.env(hash=TRUE)
in the base package



On Tue, Nov 2, 2021 at 11:48 AM Yonghua Peng  wrote:


I know this is a newbie question. But how do I implement the hash structure
which is available in other languages (in python it's dict)?

I know there is the list, but list's names can be duplicated here.


x <- list(x=1:5, y=month.name, x=3:7)
x

$x
[1] 1 2 3 4 5

$y
 [1] "January"   "February"  "March"     "April"     "May"       "June"
 [7] "July"      "August"    "September" "October"   "November"  "December"

$x
[1] 3 4 5 6 7



Thanks a lot.



Re: [R] Getting different results with set.seed()

2021-08-19 Thread Jan van der Laan




What you could also try is to check whether the self-coded functions use 
the random number generator when they are defined:


starting_seed <- .Random.seed

Step 1. Self-coded functions (these functions generate random numbers as 
well)


# check if functions have modified the seed:
all.equal(starting_seed, .Random.seed)

Step 2: set.seed (123)



What has also happened to me is that some of the functions I called had 
their own random number generator independent of that of R. For example 
using one in C/C++.


Do your functions do stuff in parallel? For example using the parallel 
or snow package? In that case you also have to set the seed in the 
parallel workers.
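
For the parallel case, a minimal sketch (assuming the parallel package is 
used) would be something like:

library(parallel)
cl <- makeCluster(2)
clusterSetRNGStream(cl, iseed = 123)  # reproducible, independent streams per worker
res <- parLapply(cl, 1:4, function(i) rnorm(3))
stopCluster(cl)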


Best,
Jan









On 19-08-2021 11:25, PIKAL Petr wrote:

Hi

Did you try different order?

Step 2: set.seed (123)

Step 1. Self-coded functions (these functions generate random numbers as well)

Step 3: Call those functions.

Step 4: model results.

Cheers
Petr.

And BTW, do not use HTML formatting; it could cause problems on a text-only list.


From: Shah Alam 
Sent: Thursday, August 19, 2021 10:10 AM
To: PIKAL Petr 
Cc: r-help mailing list 
Subject: Re: [R] Getting different results with set.seed()

Dear Petr,

It is more than 2000 lines of code with a lot of functions and data inputs. I
am not sure whether it would be useful to upload it. However, you are
absolutely right. I used

Step 1. Self-coded functions (these functions generate random numbers as well)

Step 2: set.seed (123)

Step 3: Call those functions.

Step 4: model results.

I close the R session and run the code from step 1. I get different results
for the same set of values for parameters.

Best regards,
Shah




On Thu, 19 Aug 2021 at 09:56, PIKAL Petr <mailto:petr.pi...@precheza.cz>
wrote:
Hi

Please provide at least your code preferably with some data to reproduce
this behaviour. I wonder if anybody could help you without such information.

My wild guess is that you used

set.seed(1234)

some code

the code used again

in which case you have to expect different results.

Cheers
Petr


-Original Message-
From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of Shah Alam
Sent: Thursday, August 19, 2021 9:46 AM
To: r-help mailing list <mailto:r-help@r-project.org>
Subject: [R] Getting different results with set.seed()

Dear All,

I was using set.seed to reproduce the same results for the discrete event
simulation model. I have 12 unknown parameters for optimization (just a
little background). I got a good fit of parameter combinations. However,
when I use those parameters combinations again in the model. I am getting
different results.

Is there any problem with the set.seed. I assume the set.seed should
produce the same results.

I used set.seed(1234).

Best regards,
Shah



[R] Mean absolute error from data matrix

2021-06-23 Thread Faheem Jan via R-help
I have data matrix of order 24*2192 where 2192 are the days and 24 are hour's 
of a single day,so simple words I have 2192 days and each day having 24 
observations.the data matrix is divided into two matrix,the ist matrix is of 
order 24*1827 and second is of order 24*365. Suppose the ist column of the 
second matrix is Sunday then we choose each column of the first matrix having 
Sunday. The takeing the first column of data matrix is converted into vector 
and all the Sunday columns are converted into vectors. Then we calculate mean 
absolute errors for different pairs of the first vector of the second matrix 
with each vector of first matrix. Similarly process is repeated for the rest of 
the week days. It clear that such process is quite time consuming and hard if 
perform manually. Can any one provides the easiest way to do such 
problem.Regard 



Re: [R] Read fst files

2021-06-09 Thread Jan van der Laan




read_fst is from the fst package. The file format fst uses is a binary 
format designed to be fast to read. It is a column-oriented, compressed 
format. So, to be able to work, fst needs access to the file itself and 
won't accept a file connection the way read.table and its variants do.


Also, because it is already a compressed binary format using a compression 
method that is fast to read, additionally compressing it into a zip seems 
to defeat the purpose of fst.


HTH,
Jan


On 09-06-2021 15:28, Duncan Murdoch wrote:

On 09/06/2021 9:12 a.m., Jeff Reichman wrote:

Duncan

Yea that will work. It appears to be related to setting my working 
dir, for what ever reason neither seem to work
(1) knitr::opts_knit$set(root.dir 
="~/My_Reference_Library/Regression") # from R Notebook or
(2) 
setwd("C:/Users/reichmaj/Documents/My_Reference_Library/Regression") # 
from R chunk


So it appears I can either (as you suggested) use two steps or combine 
but I need to enter the full path. Why other file types don't seem to 
need the full path ?


You need to read the documentation for read_fst() to find what it needs. 
  If it doesn't explain this, then you should report the issue to its 
author.




myObject <- 
read_fst(unz("C:/Users/reichmaj/Documents/My_Reference_Library/Regression/Datasest.zip", 
filename = "myFile.fst"))


Thank you. I guess just one of those R things


No, it's a read_fst() thing.

Duncan Murdoch



Re: [R] What is an alternative to expand.grid if create a long vector?

2021-04-20 Thread Jan van der Laan



This is an optimisation problem that you are trying to solve using a 
grid search. There are numerous methods for optimisation, see 
https://cran.r-project.org/web/views/Optimization.html for an overview 
for R. Which method is appropriate really depends on the exact problem.


As Petr said, helping you decide which method to use does not fit on this 
list. Perhaps the overview linked to above (and the terms 'grid search' 
and 'optimization') can help you find an appropriate method.


HTH,
Jan


On 20-04-2021 09:02, PIKAL Petr wrote:

Hi



Keep your mails on the list. Actually you did not say much about your data and
the way how do you want to model them. There are plenty of modelling functions
in R starting with e.g. lm but I am not aware of a procedure in which you just
design your explanatory variables to set plausible model. But I am not expert
in statistics and this list is not ment for solving statistical problems.



Cheers

Petr





From: Shah Alam 
Sent: Monday, April 19, 2021 5:20 PM
To: PIKAL Petr 
Subject: Re: [R] What is an alternative to expand.grid if create a long
vector?



Dear Petr,



Thanks for your response. I am designing a model with 10 unknown parameters.
The generated combinations of unknown parameters will be used in the model to
estimate the set of vectors that fits the actual data well. Is there any other
way to do it? I also used the randomLHS function from the lhs package, but it
did not serve the purpose.



Best regards,

Shah Alam





On Mon, 19 Apr 2021 at 16:07, PIKAL Petr wrote:

Hi

Actually expand.grid produces data frame and not vector. And dimension of
the data frame is "big"


dim(A)

[1] 1 4

str(A)

'data.frame':   1 obs. of  4 variables:
  $ Var1: num  0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01 ...
  $ Var2: num  1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04 1e-04
...
  $ Var3: num  0.38 0.38 0.38 0.38 0.38 0.38 0.38 0.38 0.38 0.38 ...
  $ Var4: num  0.12 0.12 0.12 0.12 0.12 0.12 0.12 0.12 0.12 0.12 ...
  - attr(*, "out.attrs")=List of 2
   ..$ dim : int [1:4] 100 100 100 100
   ..$ dimnames:List of 4
   .. ..$ Var1: chr [1:100] "Var1=0.001" "Var1=0.002" "Var1=0.003"
"Var1=0.004" ...
   .. ..$ Var2: chr [1:100] "Var2=0.000100" "Var2=0.0001090909"
"Var2=0.0001181818" "Var2=0.0001272727" ...
   .. ..$ Var3: chr [1:100] "Var3=0.380" "Var3=0.3804040"
"Var3=0.3808081" "Var3=0.3812121" ...
   .. ..$ Var4: chr [1:100] "Var4=0.120" "Var4=0.1206061"
"Var4=0.1212121" "Var4=0.1218182" ...




in case of 4 sequences 1e8 rows, 4 columns
in case of 10 sequences 1e20 rows and 10 columns
in your last example 1.4e8 rows and 10 columns which probably cross the
memory capacity of your PC.

Maybe you could increase memory of you PC. If I am correct to store the
first you need about 3.2GB, to strore the last 11.2 GB.

May I ask what you want to do with such a big object?

Cheers
Petr


-Original Message-
From: R-help mailto:r-help-boun...@r-project.org> > On Behalf Of Shah Alam
Sent: Monday, April 19, 2021 2:36 PM
To: r-help mailing list mailto:r-help@r-project.org>
  >
Subject: [R] What is an alternative to expand.grid if create a long

vector?


Dear All,

I would like to know that is there any problem in *expand.grid* function

or it

is a limitation of this function.

I am trying to create a combination of elements using expand.grid

function.


A <- expand.grid(
c(seq(0.001, 0.1, length.out = 100)),
c(seq(0.0001, 0.001, length.out = 100)), c(seq(0.38, 0.42, length.out =

100)),

c(seq(0.12, 0.18, length.out = 100)))

Four combinations work fine. However, If I increase the combinations up to
ten. The following error appears.

  A <- expand.grid(
c(seq(0.001, 1, length.out = 100)),
c(seq(0.0001, 0.001, length.out = 100)), c(seq(0.38, 0.42, length.out =

100)),

c(seq(0.12, 0.18, length.out = 100)), c(seq(0.01, 0.04, length.out =

100)),

c(seq(0.0001, 0.001, length.out = 100)), c(seq(0.0001, 0.001, length.out =
100)), c(seq(0.001, 0.01, length.out = 100)), c(seq(0.01, 0.3, length.out

= 100))

)

*Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
invalid 'times' value*

After reducing the length to 10. It produced a different type of error

A <- expand.grid(
c(seq(0.001, 0.005, length.out = 10)),
c(seq(0.0001, 0.0005, length.out = 10)), c(seq(0.38, 0.42, length.out =

5)),

c(seq(0.12, 0.18, length.out = 7)), c(seq(0.01, 0.04, length.out = 5)),
c(seq(0.0001, 0.001, length.out = 10)), c(seq(0.0001, 0.001, length.out =

10)),

c(seq(0.001, 0.01, length.out = 10)), c(seq(0.1, 0.8, length.out = 8))
)

*Error: canno

Re: [R] What is an alternative to expand.grid if create a long vector?

2021-04-20 Thread Jan van der Laan



But even if you had a generator that is super-efficient and performed a 
calculation that is super-fast, the number of elements is ridiculously 
large.


If we take 1 nanosec per element; the computation would still take:

> (100^10)*1E-9/3600
[1] 27777778

hours, or

> (100^10)*1E-9/3600/24/365
[1] 3170.979

years.

--
Jan








On 20-04-2021 03:46, Avi Gross via R-help wrote:

Just some thoughts I am considering about the issue of how to make giant 
objects in memory without making them giant or all in memory.

As stupid as this sounds, when things get really big, it can mean not only 
processing your data in smaller amounts but using other techniques than asking 
expand.grid to create all possible combinations in advance.

Some languages like python allow generators that yield one item at a time and 
are called until exhausted, which sounds more like your usage. A single 
function remains resident in memory and each time it is called it uses the 
resident values in a calculation and returns the next. That approach may not 
work well with the way expand.grid works.

So a less efficient way would be to write your own deeply nested loop that 
generates one set of ten or so variables each time through the deepest nested 
loop that you can use one at a time. Alternatively, you can use such a loop to 
write a line at a time in something like a .CSV format and later read N lines 
at a time from the file or even have multiple programs work in parallel by 
taking their own allocations after ignoring the lines not meant for them, or 
some other method.
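
A rough illustration of that idea in R (all names here are made up): compute the i-th row of the virtual grid directly from its index, in the same order expand.grid would produce it, so the full table never has to exist in memory.

levels_list <- list(seq(0.001, 0.1, length.out = 100),
                    seq(0.0001, 0.001, length.out = 100),
                    seq(0.38, 0.42, length.out = 100),
                    seq(0.12, 0.18, length.out = 100))
lens <- vapply(levels_list, length, integer(1))

nth_combination <- function(i, levels_list, lens) {
  idx <- i - 1                      # 0-based position in the virtual grid
  out <- numeric(length(levels_list))
  for (j in seq_along(levels_list)) {
    out[j] <- levels_list[[j]][idx %% lens[j] + 1]   # first variable varies fastest
    idx <- idx %/% lens[j]
  }
  out
}

nth_combination(1, levels_list, lens)      # same values as the first row of expand.grid
nth_combination(12345, levels_list, lens)  # an arbitrary row, computed on demand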

Deeply nested loops in R tend to be slow, as I have found out, which is indeed 
why I switched to using pmap() on a data.frame made using expand.grid first. 
But if your needs are exorbitant and you have limited memory, 

Can you squeeze some memory out of your design? Your data seems highly 
repetitive and if you really want to store something like this in a column:
c(seq(0.001, 1, length.out = 100))

The size of that, for comparison, is:

object.size(seq(0.001, 1, length.out = 100))
848 bytes

So it is 8 bytes per number plus some overhead.

Then consider storing something like that another way. First, the c() wrapper 
around the above is redundant, albeit harmless. Why not store this:
1L:100L

object.size(1L:100L)
448 bytes

So, four bytes per number plus some overhead.

That stores integers between 1 and 100 and in your case that means that later 
you can divide by a thousand or so to get the number you want each time but not 
store a full double-precision number.

And if you use factors, it may take less space. I note some of your other 
values pick different starting and ending points but in all cases you ask for 
100 equally-spaced values to be calculated by seq() which is fine but you could 
simply record a factor with umpteen specific values as either doubles or 
integers and if expand.grid honors that, it would use less space in any final 
output.  My experiments (not shown here) suggest you can easily cut sizes in 
half and perhaps more with judicious usage.
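
A toy version of that, with made-up names and only two of the sequences: keep small integer codes in the grid and expand them to the real values only when a row is actually used.

g <- expand.grid(a = 1:100, b = 1:100)              # two integer columns, 10,000 rows
vals_a <- seq(0.001, 0.1, length.out = 100)         # look-up tables, 100 doubles each
vals_b <- seq(0.0001, 0.001, length.out = 100)
head(data.frame(a = vals_a[g$a], b = vals_b[g$b]))  # real values reconstructed on demand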

Perhaps finding or writing a more efficient loop in a C or C++ function would 
allow a way to loop through all possibilities more efficiently and provide a 
function for it to call on each iteration. Depending on your need, that can do 
a calculation using local variables and perhaps add a line to an output file, 
or add another set of values to a vector or other data structure that gets 
returned at the end of processing.

One possibility to consider is using an on-line resource, perhaps paying a fee, 
that will run your R program for you in an environment with more allowed 
resources like memory:

  https://rstudio.cloud/

Some of the professional options allow 8 GB of memory and perhaps 4 CPU. You 
can, of course, configure your own machine to have more memory or perhaps 
allocate lots more swap space and allow your process to abuse it.

There are many possible solutions but also consider if the sizes and amounts 
you are working on are realistic. I worked on a project a while ago where I 
generated a huge amount of instances with 500 iterations per instance and was 
asked to bump that up to 10,000 per instance (20 times as much) just to show 
the results were similar and that 500 had been enough. It ran for DAYS and 
luckily the rest of the project went back to more manageable numbers.

So, back to your scenario, I wonder if the regularity of your data would allow 
interesting games to be played. Imagine smaller combinations of say 10 levels 
each and for each row in the resulting data.frame, expand that out again so the 
number 2,3,4 (using just three for illustration) becomes (2:29, 3:39, 4:49) and 
is given to expand.grid to make a smaller local one-use expansion table to use. 
Your original giant problem is converted to making a modest table that for each 
row expands to a second modest table that is used and immediately discarded and 
replaced by a s

[R] Check accuracy of the model

2021-04-17 Thread Faheem Jan via R-help
   
Hi, I hope you are fine. I have a problem with functional time series. I am
working with hourly electricity spot price data; due to the large
dimensionality I convert the discrete data into functional data. The model I
use is the functional autoregressive model of order more than one. As far as I
know, no R package is available to deal with such a model, so I apply an
alternative method: I use functional principal components (FPCs) for dimension
reduction and use the associated principal component scores for forecasting
through a multivariate time series model. Then I convert these forecast scores
back into functional curves through the Karhunen-Loeve decomposition, so that I
obtain a forecast for each day as a single curve. Now, to check the accuracy of
the model, I want to calculate the percentage mean square error or mean
absolute error. My problem starts here: I want to convert each curve back into
24 discrete points. Is there any package in R which is helpful for dealing with
such a problem? I will be waiting for your fruitful reply in this regard.
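
For the last step -- turning a functional curve back into 24 discrete points -- one possibility is fda::eval.fd; a minimal sketch, where the basis and the curve are only stand-ins because the original objects are not shown:

library(fda)
basis  <- create.fourier.basis(rangeval = c(0, 1), nbasis = 5)
day.fd <- fd(coef = rnorm(5), basisobj = basis)   # stand-in for one forecast curve
hours  <- seq(0, 1, length.out = 24)
eval.fd(hours, day.fd)                            # 24 discrete values for that day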



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] /usr/local/lib/R/site-library is not writable

2021-04-08 Thread Jan van der Laan



I would actually go a step in the other direction: per project 
libraries. For example by adding a .Rprofile file to your project 
directory. This ensures that everybody working on a project uses the 
same version of the packages (even on different machines e.g. on shared 
folders).


This can give issues when a new version of R arrives, but that is 
usually easy to solve. Either hard code the path to the old R-version or 
decide to update all packages in a project to the new R-version (and 
test that everything is still working ok).


We have the most often used packages installed centrally on the 
server/network, so I actually usually end up with a mixture of central, 
personal and project libraries. Theory vs practice.
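
For example, a minimal .Rprofile along these lines (the directory layout is just one possible choice):

local({
  lib <- file.path(getwd(), "r-library", as.character(getRversion()))
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib, .libPaths()))   # project library first, then the usual ones
})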


HTH,
Jan



On 08-04-2021 02:58, Dirk Eddelbuettel wrote:

Hi Gene,

"It's complicated". (Not really, but listen for a sec...)

We need to ship a default policy that makes sense for all / most
situations.  So

- users cannot write into /usr/local/lib/R/site-library -- unless they are
   set up to, by adding them to the 'group' that owns that directory

- root can (but ideally one should not run as root as one generally does not
   know what code you might get slipped in a tar.gz); but root can enable users

- so we recommend letting (some or all) users write there by explicitly
   adding them to an appropriate group.

Personally, I do not think personal libraries are a good idea on shared
machines because you can end up with a different set of package (versions)
than your colleague on the same machine.  And or you running shiny from $HOME
have different packages than shiny running as server. And on and on. Other
people differ, and that is fine. If one wants personal libraries one can.

I must have explained the reasoning and fixes a dozen times each on
r-sig-debian (where you could have asked this too) and StackOverflow. At
least the latter can be searched so look at this set:
https://stackoverflow.com/search?q=user%3Ame+is%3Aanser+%2Fusr%2Flocal%2Flib%2FR%2Fsite-library

Happy to take it offline too, and who knows, we even get to meet for a coffee
one of these days.

Hope this helps, Dirk



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] forecast accuracy

2021-02-17 Thread Faheem Jan via R-help
I am new to functional time series, so my question may be basic. I am
forecasting one year ahead, and now I want to check the forecast accuracy by
calculating the mean absolute percentage error, but I am unable to do this in
R. Please help me, or suggest any link which would help me solve my problem.
Regards
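
For reference, the mean absolute percentage error itself is only a couple of lines of R (a sketch; the numbers in the example call are made up):

mape <- function(actual, forecast) 100 * mean(abs((actual - forecast) / actual))
mape(actual = c(100, 120, 90), forecast = c(95, 130, 85))   # about 6.3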


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] forecast accuracy

2021-02-05 Thread Faheem Jan via R-help
I am working with functional time series. I have obtained the one-year-ahead
forecast in functional form. Now I want to assess forecast accuracy, for
example the mean absolute percentage error, in R. Please help me with how to do
this in R.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] forecast accuracy

2021-02-04 Thread Faheem Jan via R-help
I am working with functional time series. I have obtained the one-year-ahead
forecast in functional form. Now I want to assess forecast accuracy, for
example the mean absolute percentage error, in R. Please help me with how to do
this in R.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Col names in a data frame

2021-01-21 Thread Jan T. Kim via R-help
It looks to me that the names are cranked through make.names in the
data frame case, while that doesn't happen for matrices. Peeking
into the `colnames<-` code supports this idea, but that in turn
uses `names<-`, which is a primitive and so defies further easy
peeking.

The data.frame function provides the check.names parameter to
switch this on / off, but for other classes this checking doesn't
seem to be provided.

Perhaps the idea behind this discrepancy is to enable the use of
the $ operator to access columns of the data frame, while that's
not possible for matrices anyway. (Personally, I don't find the
results of make.names that useful, though, and I tend to sanitise
them myself when working with data frames with unwieldy column
names).
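
For instance (a small illustration, using the same column names as in the quoted example below):

m <- matrix(rnorm(9), nrow = 3, dimnames = list(NULL, c("(A)", "(B)", "(C)")))
colnames(data.frame(m))                       # "X.A." "X.B." "X.C."  (run through make.names)
colnames(data.frame(m, check.names = FALSE))  # "(A)"  "(B)"  "(C)"
make.names(c("(A)", "(B)", "(C)"))            # "X.A." "X.B." "X.C."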

Best regards, Jan


On Thu, Jan 21, 2021 at 03:58:44PM -0500, Bernard McGarvey wrote:
> Here is an example piece of code to illustrate an issue:
> 
> rm(list=ls()) # Clear Workspace
> #
> Data1 <- matrix(data=rnorm(9,0,1),nrow=3,ncol=3)
> Colnames1 <- c("(A)","(B)","(C)")
> colnames(Data1) <- Colnames1
> print(Data1)
> DataFrame1 <- data.frame(Data1)
> print(DataFrame1)
> colnames(DataFrame1) <- Colnames1
> print(DataFrame1)
> 
> The results I get are:
> 
> (A)(B)(C)
> [1,]  0.4739417  1.3138868  0.4262165
> [2,] -2.1288083  1.0333770  1.1543404
> [3,] -0.3401786 -0.7023236 -0.2336880
> X.A.   X.B.   X.C.
> 1  0.4739417  1.3138868  0.4262165
> 2 -2.1288083  1.0333770  1.1543404
> 3 -0.3401786 -0.7023236 -0.2336880
>  (A)(B)(C)
> 1  0.4739417  1.3138868  0.4262165
> 2 -2.1288083  1.0333770  1.1543404
> 3 -0.3401786 -0.7023236 -0.2336880
> 
> so that when I make the matrix with headings the parentheses are replaced by 
> periods but I can add them after creating the data frame and the column 
> headings are correct. 
> 
> Any ideas on why this occurs?
> 
> Thanks
> 
> 
> Bernard McGarvey
> Director, Fort Myers Beach Lions Foundation, Inc.
> Retired (Lilly Engineering Fellow).
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with connection issue for R (just joined, leading R for our agency)

2020-12-15 Thread Jan van der Laan

Alejandra,

If it was initially working ok, I would first check with the IT 
department if there has been a change to the configuration of the 
firewall, virus scanners, file system etc. as these can affect the 
performance of R-studio. R-studio uses a client-server setup on your 
machine, so a firewall/malware scanner inspecting all communication 
between R-studio and the R session can have a large effect. If  you 
can't find the problem, you are probably better of asking at the 
R-studio fora. A similar question was asked a while back: 
https://community.rstudio.com/t/rstudio-suddenly-slow-processing/4959; 
perhaps some of the solutions proposed also work for you.


As an alternative to Emacs/RStudio you could also have a look at Visual
Studio Code. It has an R plugin. If your organisation is Microsoft
oriented there is a good chance that it is already available. You need
a relatively recent version though.


HTH,
Jan




On 14-12-2020 12:54, Michael Dewey wrote:
Just to add to Petr's comment there are other basic editors with syntax 
highlighting like Notepad++ which are also OK if you want a fairly 
minimalist approach.


Michael

On 14/12/2020 08:16, PIKAL Petr wrote:

Hallo Alejandra

Although RStudio and ESS could help with some automation (each in its own
way), using R alone is not a big problem, especially if you are not familiar
with Emacs basics and are experiencing RStudio issues. I use R with a simple
external editor - it could be Notepad, but I would recommend Tinn-R

https://sourceforge.net/projects/tinn-r/

which has syntax highlighting and works smoothly if the R console is set to
multiple windows. It is also quite easy to manage.

Good luck with R.

Cheers
Petr


-Original Message-
From: R-help On Behalf Of Alejandra Barrio Gorski
Sent: Tuesday, December 8, 2020 7:48 PM
To: R-help@r-project.org
Subject: [R] Help with connection issue for R (just joined, leading R for our agency)

Dear fellow R users,

Greetings, I am new to this list. I joined because I am pioneering the use of R
for the agency I work for. I essentially work alone and would like to reach
out for help on an issue I have been having. Here it is:

    - From one day to the next, my RStudio does not execute commands when I
    press ctrl + enter. Nothing happens, and then after a few minutes out of
    nowhere, it runs everything at once. This makes it very hard to do my work.
    - I tried uninstalling and re-installing both R and Rstudio, but the
    error comes up again. I tested commands on my R program alone, and it works
    fine there. It could be the way that Rstudio connects to R.
    - I am on a Windows 10 computer. I work for a government agency so there
    may be a few firewall/virus protection issues.

I would love any pointers.

Thank you,
Alejandra

--

*Alejandra Barrio*
Linkedin <https://www.linkedin.com/in/alejandra-barrio/> | Website
<https://www.ocf.berkeley.edu/~alejandrabarrio/>
MPP | M.A., International and Area Studies University of California,

Berkeley


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-
guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help in R code

2020-10-18 Thread Faheem Jan via R-help
Good morning. Please help me to code this in R.

I am working with multivariate time series data. My objective is a one-year
forecast of an hourly time series, using the first five years as a training set
and the remaining one year as validation. For this I transform the data into
functional data through a Fourier basis, apply functional principal components
(FPCs) as dimension reduction explaining a specific amount of variation, and
use the corresponding FPC scores. I use a VAR model on those FPC scores to
produce a one-day-ahead forecast. My problem is that I choose four FPC scores,
which give only four values for a single day; I want the forecast for all 24
hours, not only 4, and then I want to transform it back to the original
functional data. For understanding I am sharing my code: (1) transformation of
the multivariate time series data into functional data, (2) the functional
principal components and the corresponding scores, (3) the functional final
prediction error used for selecting the parameters of the VAR model, (4) using
VAR for the analysis and forecasting.

(1)
nb = 23 # number of basis functions for the data
fbf = create.fourier.basis(rangeval=c(0,1), nbasis=nb) # basis for data
args = seq(0,1,length=24)
fdata1 = Data2fd(args, y=t(mat), fbf) # functions generated from discretized y

(2)
ffpe = fFPE(fdata1, Pmax=10)
d.hat = ffpe[1] # order of the model
p.hat = ffpe[2] # lag of the model

(3)
n = ncol(fdata1$coef)
D = nrow(fdata1$coef)
# center the data
mu = mean.fd(fdata1)
data = center.fd(fdata1)
# fPCA
fpca = pca.fd(data, nharm=D)
scores = fpca$scores[, 1:d.hat]

(4)
# to avoid warnings from vars predict function below
colnames(scores) <- as.character(seq(1:d.hat))
VAR.pre = predict(VAR(scores, p.hat), n.ahead=1, type="const")$fcst

After this I need help, first, with how to transform this back into the
original functional data and obtain the forecast for each of the 24 hours (i.e.
a one-day forecast), and then with how to generalize the result to one year.
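
A possible sketch of the missing back-transformation step, assuming the objects above (fdata1, mu, fpca, d.hat, VAR.pre) and the fda package; the names pred.scores, recon.coefs and fc.fd are illustrative only:

pred.scores <- sapply(VAR.pre, function(f) f[1, "fcst"])          # one forecast score per retained FPC
recon.coefs <- mu$coefs[, 1] +
  fpca$harmonics$coefs[, 1:d.hat, drop = FALSE] %*% pred.scores   # Karhunen-Loeve reconstruction
fc.fd <- fd(coef = recon.coefs, basisobj = fdata1$basis)          # forecast curve for the next day
eval.fd(seq(0, 1, length.out = 24), fc.fd)                        # its 24 hourly values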


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help in Coding

2020-10-13 Thread Faheem Jan via R-help


Good morning dear administrators,

Please help me to code this in R.

I am working with multivariate time series data. My objective is a one-year
forecast of an hourly time series, using the first five years as a training set
and the remaining one year as validation. For this I transform the data into
functional data through a Fourier basis, apply functional principal components
(FPCs) as dimension reduction explaining a specific amount of variation, and
use the corresponding FPC scores. I use a VAR model on those FPC scores to
produce a one-day-ahead forecast. My problem is that I choose four FPCs, which
give only four values for a single day; I want the forecast for all 24 hours,
not only 4, and then I want to transform it back to the original functional
data. For understanding I am sharing my code: (1) transformation of the
multivariate time series data into functional data, (2) the functional
principal components and the corresponding scores, (3) the functional final
prediction error used for selecting the parameters of the VAR model, (4) using
VAR for the analysis and forecasting.

(1)
nb = 23 # number of basis functions for the data
fbf = create.fourier.basis(rangeval=c(0,1), nbasis=nb) # basis for data
args = seq(0,1,length=24)
fdata1 = Data2fd(args, y=t(mat), fbf) # functions generated from discretized y

(2)
ffpe = fFPE(fdata1, Pmax=10)
d.hat = ffpe[1] # order of the model
p.hat = ffpe[2] # lag of the model

(3)
n = ncol(fdata1$coef)
D = nrow(fdata1$coef)
# center the data
mu = mean.fd(fdata1)
data = center.fd(fdata1)
# fPCA
fpca = pca.fd(data, nharm=D)
scores = fpca$scores[, 1:d.hat]

(4)
# to avoid warnings from vars predict function below
colnames(scores) <- as.character(seq(1:d.hat))
VAR.pre = predict(VAR(scores, p.hat), n.ahead=1, type="const")$fcst

After this I need help, first, with how to transform this back into the
original functional data and obtain the forecast for each of the 24 hours (i.e.
a one-day forecast), and then with how to generalize the result to one year.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Installing a Package

2020-10-11 Thread Faheem Jan via R-help



Hello, I am working in nonparametric functional data analysis. I am stuck on a
simple problem: I want to install a package named nfda which is not present on
CRAN. How can I install this package in RStudio from an archive or somewhere
else? Please guide me; I am just a beginner in R. Thanks
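
One generic recipe for a source package that is not on CRAN (a sketch; the URL below is a placeholder, not the real location of nfda):

download.file("https://example.org/nfda_0.1.tar.gz", destfile = "nfda_0.1.tar.gz")
install.packages("nfda_0.1.tar.gz", repos = NULL, type = "source")
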
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help in R code

2020-10-04 Thread Faheem Jan via R-help
Hello, I am working with functional time series using multivariate (hourly)
time series data. I am using a FAR model of order more than one, for which no
statistical package is available in R, so I convert my data into functional
form, obtain the functional principal components, and extract their
corresponding FPC scores. Then I use a VAR model on those FPC scores to
forecast each of the 24 hours. The VAR gives me the forecasted values for all
23 hours when I put phat=23, but whenever I put phat=24, i.e. I want to predict
each of the 24 hours, it gives the results in the form of NA. The code is given
below.

 

fdata <- function(mat){
  nb = 27 # number of basis functions for the data
  fbf = create.fourier.basis(rangeval=c(0,1), nbasis=nb) # basis for data
  args = seq(0,1,length=24)
  fdata1 = Data2fd(args, y=t(mat), fbf) # functions generated from discretized y
  return(fdata1)
}

prediction.ffpe = function(fdata1){
  n = ncol(fdata1$coef)
  D = nrow(fdata1$coef)
  # center the data
  # mu = mean.fd(fdata1)
  data = center.fd(fdata1)
  # ffpe = fFPE(fdata1, Pmax=10)
  # p.hat = ffpe[2] # order of the model
  d.hat = 23
  p.hat = 6
  # fPCA
  fpca = pca.fd(data, nharm=D, centerfns=TRUE)
  scores = fpca$scores[, 0:d.hat]
  # to avoid warnings from vars predict function below
  colnames(scores) <- as.character(seq(1:d.hat))
  VAR.pre = predict(VAR(scores, p.hat), n.ahead=1, type="const")$fcst
}

 

Kindly guide me on how I can solve my problem or what error I am making. THANKS


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help in R code

2020-10-02 Thread Faheem Jan via R-help


Hello, I am working with functional time series using multivariate (hourly)
time series data. I am using a FAR model of order more than one, for which no
statistical package is available in R, so I convert my data into functional
form, obtain the functional principal components, and extract their
corresponding FPC scores. Then I use a VAR model on those FPC scores to
forecast each of the 24 hours. The VAR gives me the forecasted values for all
23 hours when I put phat=23, but whenever I put phat=24, i.e. I want to predict
each of the 24 hours, it gives the results in the form of NA. The code is given
below.

fdata <- function(mat){
  nb = 27 # number of basis functions for the data
  fbf = create.fourier.basis(rangeval=c(0,1), nbasis=nb) # basis for data
  args = seq(0,1,length=24)
  fdata1 = Data2fd(args, y=t(mat), fbf) # functions generated from discretized y
  return(fdata1)
}

prediction.ffpe = function(fdata1){
  n = ncol(fdata1$coef)
  D = nrow(fdata1$coef)
  # center the data
  # mu = mean.fd(fdata1)
  data = center.fd(fdata1)
  # ffpe = fFPE(fdata1, Pmax=10)
  # p.hat = ffpe[2] # order of the model
  d.hat = 23
  p.hat = 6
  # fPCA
  fpca = pca.fd(data, nharm=D, centerfns=TRUE)
  scores = fpca$scores[, 0:d.hat]
  # to avoid warnings from vars predict function below
  colnames(scores) <- as.character(seq(1:d.hat))
  VAR.pre = predict(VAR(scores, p.hat), n.ahead=1, type="const")$fcst
}

Kindly guide me on how I can solve my problem or what error I am making.
THANKS

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Would Like Some Advise

2020-08-29 Thread Jan Galkowski
 to the McAfee subscription my wife has for 
other systems in the house. 

Note, while R is my primary computational world, by far, I do run Anaconda 
Python 3 from time to time.  It can be useful for preparing data for 
consumption by R, given raw files, many with glitches and mistakes.  But with 
the data.table package and other packages in R, I'm finding that's less and 
less true. The biggest headache of Python is that you need to keep its 
libraries updated.  I also have used Python some times just to access 
MATPLOTLIB.  I prefer R, though, because, like MATLAB, its numerics are better 
than Python's NUMPY and SCIPY.

As I said, I don't know Mac at all well.  But I do know that, when Mac released 
a new version, somehow the colleagues about me would often degenerate into a 
couple of days of grumbling and meeting with each other about how they got past 
or around some stumbling point when updating their systems.  Otherwise people 
seem to like them a lot. 

I think all operating systems are deals with the Devil. It's what you put up 
with and deal with. 

As you can see, I opted to go the Windows route again, for probably the next 10 
years. 

YMMV.

 - Jan

On Sat, Aug 29, 2020, at 06:00, r-help-requ...@r-project.org wrote:
> From: "Philip" 
> To: "r-help" 
> Subject: [R] Would Like Some Advise
> Message-ID: <1157A76A248944878C040D1FE0AE725C@OWNERPC>
> Content-Type: text/plain; charset="utf-8"
> 
> I need a new computer. I have a friend who is convinced that I have an 
> aura about me that just kills electronic devices.
> 
> Does anyone out there have an opinion about Windows vs. Linux?  
> 
> I’m retired so this is just for my own enjoyment but I’m crunching some 
> large National Weather Service files and will move on to baseball data 
> and a few other things.  I’d like some advice about how much RAM and 
> stuff like that.  I understand there is something called zones of 
> computer memory. Can someone direct me to a good source so I can learn 
> more?   I really don’t understand stuff like this.  Does anyone think I 
> need to upgrade my wifi?
> 
> Thanks,
> Philip

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read a file containing two types of rows - (for the Netflix challenge data format)

2020-01-31 Thread Jan Galkowski
With the *data.table* package, *R* can use *fread* as follows:

> grab<- function(file)
> {
>  fin<- fread(file=file,
>  sep=NULL,
>  dec=".",
>  quote="", nrows=Inf, header=FALSE,
>  stringsAsFactors=FALSE, verbose=FALSE,
>  col.names=c("record"),
>  check.names=FALSE, fill=FALSE, blank.lines.skip=FALSE,
>  showProgress=TRUE,
>  data.table=FALSE, skip=0,
>  nThread=2, logical01=FALSE, keepLeadingZeros=FALSE)
>  cat(sprintf("Read '%s'.\n", file))
>  #
>  substance<- apply(X=fin, MARGIN=1, FUN=function(r) chartr(",", "\t", r[1]))
>  cat(sprintf("Translated '%s'.\n", file))
>  D<- fread(text=substance,
>  sep="\t",
>  dec=".",
>  quote="", nrows=Inf, header=FALSE,
>  stringsAsFactors=FALSE, verbose=FALSE,
>  col.names=c("ip", "valid.hits", "err.hits", "megabytes"),
>  check.names=FALSE, fill=FALSE, blank.lines.skip=FALSE,
>  showProgress=TRUE,
>  data.table=FALSE, skip=0,
>  nThread=2, logical01=FALSE, keepLeadingZeros=FALSE)
>  cat(sprintf("Parsed '%s'.\n", file))
>  ip<- D$ip
>  withinBlock<- sapply(X=ip, FUN=function(s) as.integer((strsplit(x=s, 
> split=".", fixed=TRUE)[[1]])[4]))
>  D$within.block<- withinBlock
>  return(D)
> }
> 

In short, one pass pulls in all the records into an internal structure, which 
can be edited or manipulated at will, and then a second call to *fread* parses 
it properly. 

*fread *is fast, even for big datasets.


--
Jan Galkowski 

https://www.linkedin.com/in/deepdevelopment

member,

... American Statistical Association
... International Society for Bayesian Analysis
... Ecological Society of America
... International Association of Survey Statisticians
... American Association for the Advancement of Science
... TeX Users Group

(pronouns: *he, him, his*)

*Keep your energy local*. --John Farrell, *ILSR <http://ilsr.org/>*


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extracting a particular column from list

2020-01-16 Thread Faheem Jan via R-help
Hi. How can I extract a column from a list? I will be thankful.

Sent from Yahoo Mail on Android
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] spurious locking of packages

2019-12-27 Thread Jan Galkowski
I have been having a problem installing binary packages on Windows, since 3.6.x 
hit the streets.


I am using the
> 
> INSTALL_opts = c('--no-lock')
> 
option, but it occurs nevertheless. My habit is to install an update of R 
(latest, 3.6.2), then run update.packages(.):

> 
> trying URL 
> 'https://cran.cnr.berkeley.edu/bin/windows/contrib/3.6/zoib_1.5.4.zip'
> Content type 'application/zip' length 350788 bytes (342 KB)
> downloaded 342 KB
> 
> package ‘elasticnet’ successfully unpacked and MD5 sums checked
> package ‘ellipse’ successfully unpacked and MD5 sums checked
> package ‘elliptic’ successfully unpacked and MD5 sums checked
> package ‘EMCluster’ successfully unpacked and MD5 sums checked
> package ‘EMD’ successfully unpacked and MD5 sums checked
> Warning: cannot remove prior installation of package ‘EMD’
> Warning in file.copy(savedcopy, lib, recursive = TRUE) :
>  problem copying C:\Program 
> Files\R\R-2.13.1\library\00LOCK\EMD\libs\x64\EMD.dll to C:\Program 
> Files\R\R-2.13.1\library\EMD\libs\x64\EMD.dll: Permission denied
> Warning: restored ‘EMD’
> package ‘emdbook’ successfully unpacked and MD5 sums checked
> package ‘emdist’ successfully unpacked and MD5 sums checked
> package ‘emmeans’ successfully unpacked and MD5 sums checked
> package ‘emoa’ successfully unpacked and MD5 sums checked
> Error in unpackPkgZip(foundpkgs[okp, 2L], foundpkgs[okp, 1L], lib, libs_only, 
> :
>  ERROR: failed to lock directory ‘C:\Program Files\R\R-2.13.1\library’ for 
> modifying
> Try removing ‘C:\Program Files\R\R-2.13.1\library/00LOCK’
> >
> 


Note the above is preceded by a long list of packages which are, in each case, 
re-loaded from whatever repo at a mirror being used.

I have found the p_unlock() from package pacman useful. After assigning global 
variable P to the results of available.packages(), I repeatedly do:
> 
> > p_unlock()
> The following 00LOCK has been deleted:
> C:/Program Files/R/R-2.13.1/library/00LOCK
> > match(c("emoa"), P)
> [1] 13
> > P<- P[13:length(P)]
> > update.packages(method=NULL, ask=FALSE, checkBuilt=TRUE, type="win.binary", 
> > instPkgs=P,
> + dependencies=c("Imports", "Depends", "Suggests"), 
> INSTALL_opts=c("--no-lock"))
> 

where *emoa* is a stand-in for whatever package faulted during the load. (I 
also have no idea why *EMD* is locked in the above.)

My *sessionInfo()* is:

> > sessionInfo()
> R version 3.6.2 (2019-12-12)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 7 x64 (build 7601) Service Pack 1
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 
> LC_MONETARY=English_United States.1252 LC_NUMERIC=C 
> [5] LC_TIME=English_United States.1252 
> 
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base 
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.6.2
> >
> 

Eventually, I get to the end of P and call it done.

Anyone have a suggestion for an easier workaround?

 - Jan Galkowski


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Conversion of multivariate time series to functional time series

2019-09-19 Thread Faheem Jan via R-help
Hi, I am trying to generalize the functional autoregressive model of order one,
FAR(1), to FAR(p) through functional principal components, choosing a
particular amount of variation and then using the functional principal
component scores for prediction with a vector autoregressive VAR(p) time series
model through the VAR package. Now I want to transform these predictions into
functional form; this can be done through the Karhunen-Loeve transformation,
but how can I do this in R? Can anybody help me in this regard?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: How to read a file saved in Rstudio

2019-09-12 Thread Faheem Jan via R-help



Subject: How to read a result saved in RStudio

Hi, I ran a simulation on another, faster computer and saved the result in an
.rda file. Now I want to open this .rda file on my laptop. The file loads on my
laptop, but I get an error like this:

> load("C:/Users/Khan/Downloads/Poly.Slow.100.kappa08 (1).rda")
> Poly.Slow.100.kappa08 (1).rda
Error: unexpected symbol in "Poly.Slow.100.kappa08 (1).rda"

Can anyone help me resolve my issue? Thanks a lot in advance.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Durbin- Levinson algorithm

2019-09-04 Thread Faheem Jan via R-help
Hi, I have a problem: how can I use the Durbin-Levinson algorithm for
prediction in the case of multiple time series? How can I do this in R?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Functional final prediction error

2019-09-04 Thread Faheem Jan via R-help
Hi, my question is related to functional data analysis: what is the functional
final prediction error, and how can we use it to transform a vector
autoregressive model into its functional form? I need help in this regard; I
will be thankful.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R 3.6.1 and apcluster package

2019-07-18 Thread Jan Galkowski
I have confirmed that a complete workaround to these problems is available if, 
as Bill Dunlap suggested, "version=2" is used in all *save* incantations. 

Thanks Bill!

 - Jan

On Thu, Jul 18, 2019, at 10:39, William Dunlap wrote:
> Note that you can reproduce this in R-3.5.1 if you specify serialization 
> version 3 (which became the default in 3.6.0).
> 
> > save(apresX, file="351-2.RData", version=2)
> > save(apresX, file="351-2.RData", version=3)
> Error: C stack usage 7969184 is too close to the limit
> > version$version.string
> [1] "R version 3.5.1 (2018-07-02)"
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> 
> On Thu, Jul 18, 2019 at 12:46 AM Jan Galkowski  
> wrote:
>> > # Test for saving. Jan Galkowski, 17th July 2019.
>>  > # produceProtectionFault.R
>>  > 
>>  > library(apcluster)
>>  > cl1 <- cbind(rnorm(100, 0.2, 0.05), rnorm(100, 0.8, 0.06))
>>  > cl2 <- cbind(rnorm(50, 0.7, 0.08), rnorm(50, 0.3, 0.05))
>>  > x <- rbind(cl1, cl2)
>>  > 
>>  > ## compute similarity matrix and run affinity propagation
>>  > ## (p defaults to median of similarity)
>>  > simil<- negDistMat(x, r=2)
>>  > apres <- apcluster(s=simil, details=TRUE)
>>  > apresX<- aggExCluster(s=simil, x=apres)
>>  > 
>>  > show(apres)
>>  > show(apresX)
>>  > 
>>  > saveRDS(object=apresX, file="foo.rds", compress=TRUE)
>>  > 
>>  > #save(apresX, file="bar.data", compress=TRUE)
>>  > 
>>  > #save.image("crazy.RData")
>> 
>>  The example is from the apcluster documentation. Leaving any one of the 
>> "save"s uncommented produces said fault. 
>> 
>>  - Jan
>> 
>>  On Wed, Jul 17, 2019, at 08:18, Jeff Newmiller wrote:
>>  > It would never make sense for such messages to reflect normal and 
>> expected operation, so hypothesizing about intentionally changing stack 
>> behavior doesn't make sense.
>>  > 
>>  > The default format for saveRDS changed in 3.6.0. There may be bugs 
>> associated with that, but rolling back to 3.6.0 would just trade bugs.
>>  > 
>>  > https://cran.r-project.org/doc/manuals/r-devel/NEWS.html
>>  > 
>>  > On July 16, 2019 8:56:28 PM CDT, Jan Galkowski 
>>  wrote:
>>  > >Did something seriously change in R 3.6.1 at least for Windows in terms
>>  > >of stack impacts? 
>>  > >
>>  > >I'm encountering many problems with the 00UNLOCK, needing to disable
>>  > >locking during installations. 
>>  > >
>>  > >And I'm encountering 
>>  > >
>>  > >> Error: C stack usage 63737888 is too close to the limit
>>  > >
>>  > >for cases I did not before, even when all I'm doing is serializing an
>>  > >object to be saved with *saveRDS* or even *save.image(.)*. 
>>  > >
>>  > >Yes, I know, I did not append a minimally complete example. Just wanted
>>  > >to see if it was just me, or if anyone else was seeing this.
>>  > >
>>  > >It's on Windows 7 HE and I've run *R* here for years.
>>  > >
>>  > >My inclination is to drop back to 3.6.0 if it is just me or if no one
>>  > >knows about this problem. 
>>  > >
>>  > >Thanks,
>>  > >
>>  > > - Jan Galkowski.
>>  [snip]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R 3.6.1 and apcluster package

2019-07-18 Thread Jan Galkowski
> # Test for saving. Jan Galkowski, 17th July 2019.
> # produceProtectionFault.R
> 
> library(apcluster)
> cl1 <- cbind(rnorm(100, 0.2, 0.05), rnorm(100, 0.8, 0.06))
> cl2 <- cbind(rnorm(50, 0.7, 0.08), rnorm(50, 0.3, 0.05))
> x <- rbind(cl1, cl2)
> 
> ## compute similarity matrix and run affinity propagation
> ## (p defaults to median of similarity)
> simil<- negDistMat(x, r=2)
> apres <- apcluster(s=simil, details=TRUE)
> apresX<- aggExCluster(s=simil, x=apres)
> 
> show(apres)
> show(apresX)
> 
> saveRDS(object=apresX, file="foo.rds", compress=TRUE)
> 
> #save(apresX, file="bar.data", compress=TRUE)
> 
> #save.image("crazy.RData")

The example is from the apcluster documentation. Leaving any one of the "save"s 
uncommented produces said fault. 

 - Jan

On Wed, Jul 17, 2019, at 08:18, Jeff Newmiller wrote:
> It would never make sense for such messages to reflect normal and expected 
> operation, so hypothesizing about intentionally changing stack behavior 
> doesn't make sense.
> 
> The default format for saveRDS changed in 3.6.0. There may be bugs associated 
> with that, but rolling back to 3.6.0 would just trade bugs.
> 
> https://cran.r-project.org/doc/manuals/r-devel/NEWS.html
> 
> On July 16, 2019 8:56:28 PM CDT, Jan Galkowski  
> wrote:
> >Did something seriously change in R 3.6.1 at least for Windows in terms
> >of stack impacts? 
> >
> >I'm encountering many problems with the 00UNLOCK, needing to disable
> >locking during installations. 
> >
> >And I'm encountering 
> >
> >> Error: C stack usage 63737888 is too close to the limit
> >
> >for cases I did not before, even when all I'm doing is serializing an
> >object to be saved with *saveRDS* or even *save.image(.)*. 
> >
> >Yes, I know, I did not append a minimally complete example. Just wanted
> >to see if it was just me, or if anyone else was seeing this.
> >
> >It's on Windows 7 HE and I've run *R* here for years.
> >
> >My inclination is to drop back to 3.6.0 if it is just me or if no one
> >knows about this problem. 
> >
> >Thanks,
> >
> > - Jan Galkowski.
> >
> >
> > [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Sent from my phone. Please excuse my brevity.
> 

--
Jan Galkowski (o°)

607.239.1834 [mobile]
607.239.1834 [home]

bayesianlogi...@gmail.com
http://667-per-cm.net

member,

... American Statistical Association
... International Society for Bayesian Analysis
... Ecological Society of America
... International Association of Survey Statisticians
... American Association for the Advancement of Science
... TeX Users Group

(pronouns: *he, him, his*)

*Keep your energy local*. --John Farrell, *ILSR <http://ilsr.org/>*


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R 3.6.1

2019-07-17 Thread Jan Galkowski
Did something seriously change in R 3.6.1 at least for Windows in terms of 
stack impacts? 

I'm encountering many problems with the 00UNLOCK, needing to disable locking 
during installations. 

And I'm encountering 

> Error: C stack usage 63737888 is too close to the limit

for cases I did not before, even when all I'm doing is serializing an object to 
be saved with *saveRDS* or even *save.image(.)*. 

Yes, I know, I did not append a minimally complete example. Just wanted to see 
if it was just me, or if anyone else was seeing this.

It's on Windows 7 HE and I've run *R* here for years.

My inclination is to drop back to 3.6.0 if it is just me or if no one knows 
about this problem. 

Thanks,

 - Jan Galkowski.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] functional autoregressive model of order more than one

2019-07-09 Thread Faheem Jan via R-help
Hi, I hope all of you are fine. I am working in functional time series
analysis. I fit a functional autoregressive model using the FAR package in R.
Now I want to generalize our model. How can I do this in R? Please help in this
regard; I will be thankful.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] functional autoregressive model of order two

2019-05-09 Thread Faheem Jan via R-help
Hi, I am working in functional data analysis, using time series data to which I
fit a functional autoregressive model of order one using the FAR package in R.
Now I want to extend my model to order two, but it appears that the FAR package
can only handle order one. Can anybody help me with how to extend my model to
order two, or suggest a package which would help solve my problem? I will be
thankful in this regard.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] fitting functional autoregresive model

2019-05-06 Thread Faheem Jan via R-help
Hi, I am trying to extend the functional autoregressive model of order one,
FAR(1), to fit the functional autoregressive model of order two, FAR(2). The
code I use for FAR(1) is:

library(fda)
library(far)

# CREATE DUMMY VARIABLES
factor2dummy = function(x){
  n = length(x)
  tab = as.factor(names(table(x)))
  p = length(tab)
  xdummy = matrix(0, n, p)
  for(i in 1:p)
  {
    xdummy[x==tab[i], i] = 1
  }
  colnames(xdummy) = tab
  return(xdummy)
}

# READ DATA
demnd = read.csv("c:/Users/Khan/Desktop/dem99141.csv", header=TRUE)
xdata <- as.matrix(demnd[-1,-1], ncol = 25, nrow = 1826, byrow = TRUE)
class(xdata)
date = strptime(as.character(xdata[,1]), "%Y-%m-%d")
weekday = weekdays(date)
week = format(date, "%U")
xdata = xdata[,-1]
xdata

# DAILY AVERAGE
xmean = apply(xdata, 1, mean)
xmean

# SEASONAL ADJUSTMENT
#seasadj = function(x) decompose(ts(x, frequency=7))$rand
#xdata = apply(xdata, 2, seasadj)
#xdata = xdata[!is.na(xdata[,1]),]
nrall = nrow(xdata)
#wd = factor2dummy(weekday)
#wnr = factor2dummy(week)
#e = lm(xmean ~ wd + wnr - 1)$residuals
#tsdiag(arima(e, c(7,0,0)))
#seasadj = function(x) lm(x ~ wd + wnr - 1)$residuals
#xdata = apply(xdata, 2, seasadj)

# HOLD-OUT PERIOD
nout = 100
nin = nrall - nout
xin = xdata[1:nin,]
nr = nrow(xin)
nc = ncol(xin)
n = nr*nc
y = matrix(t(xin), n, 1)
xfd = as.fdata(y, col=1, p=23, name="Cons") # p=23 is the multiple of length=39698 of the data

# ESTIMATE FAR(1) MODEL
k1 = far.cv(xfd, y="Cons", kn=20, ncv=1000)$minL2[1]
far1 = far(xfd, kn=k1)
far1

# ESTIMATE AR(1) MODELS
p = 14
f = function(x) ar(x, aic=FALSE, order.max=p)
ar.models = apply(xin, 2, f)

# FORECAST
errorsfar = matrix(0, nout, nc)
errorsar = matrix(0, nout, nc)
errorsnaive = matrix(0, nout, nc)
predar = matrix(0, 1, nc)
prednaive = matrix(0, 1, nc)
for(i in 1:nout){
  for(j in 1:length(ar.models))
  {
    predar[1,j] = predict(ar.models[[j]], newdata=xdata[(nr+i-p):(nr+i-1), j])$pred
  }
  xnew = as.fdata(t(xdata[nr+i-1,]), col=1, p=23, name="Cons")
  pred = predict(far1, newdata=xnew)
  prednaive = xdata[nr+i-1,]
  obs = xdata[nr+i,]
  errorsnaive[i,] = t(obs - prednaive)
  errorsar[i,] = t(obs - predar)
  errorsfar[i,] = t(obs - pred$Cons)
}
msefar = apply(errorsfar^2, 2, mean)
msefar
msear = apply(errorsar^2, 2, mean)
msenaive = apply(errorsnaive^2, 2, mean)
mean(msear)
mean(msenaive)
mean(msefar)

I use the consumption data ...
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bifactor model and infit statistics?

2019-04-11 Thread kende jan via R-help


Good afternoon,

I am currently in the process of calibrating an item bank using a GPCM model.
So, am I right to assume that the bifactor model allows me to work with my
general factor by assimilating it to a one-factor model, without taking into
account group factors? That is, I can estimate my item parameters from my
factor loadings on the general factor only?

If so, I have some questions about evaluating the fit of my model. The
calculation of infit statistics is specific to unidimensional models. Can I
compute infit statistics using the general factor, or do I have to do this
separately for each of the group factors? Or is there a more appropriate method
to evaluate the fit of my model when calibrating an item bank using a GPCM model?

Thank you in advance.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl::excel_sheets in tryCatch() doesn't catch error

2019-02-06 Thread Phillip-Jan van Zyl via R-help
Hi R programmers

I am reading multiple .xls and .xlsx files from a directory using readxl from 
tidyverse. When reading fails, the code should continue on to the next file.

However, when I call the custom function readExcelSheets (in a loop and with 
the tryCatch function) I get an error for some files and the code then stops 
executing. How can I force my code to continue on to the next files?

Here is the function:

readExcelSheets <- function(curPath) {
  out <- tryCatch(
{
  message("This is the 'try' part")
  dat <- excel_sheets(curPath)
},
error=function(cond) {
  message(paste("Error in opening Excel file with readxl read sheets:", 
curPath))
  message("Here's the original error message:")
  message(cond)
},
warning=function(cond) {
  message(paste("readxl caused a warning en reading sheets:", curPath))
  message("Here's the original warning message:")
  message(cond)
},
finally={
  message(paste("Processed file for sheets:", curPath))
  message("End of processing file for sheets.")
}
  )
  return(out)
}

The loop looks like this:

listLength <- length(excelList)
for (excel_file in excelList) {
  curPath <- excel_file
  sheetNames <- NULL
  sheetNames <- withTimeout({readExcelSheets(curPath)}, timeout = 5, 
onTimeout="silent")
  if(is.null(sheetNames)){next}
  for (sheetName in sheetNames){
# do something
  }
}

The problem is that I get an error:

Error: Evaluation error: zip file '' cannot be opened.

And then execution of the loop stops without progressing to the next Excel 
file. Note that for the first n=+-20 files the code works as expected. I think 
that there may be an error in the full path name (such as a text encoding 
error), but my point is that it should exit silently and progress to the next 
Excel file even if the path is not found.

Best regards
Phillip
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Function in default parameter value closing over variables defined later in the enclosing function

2019-01-23 Thread Jan T Kim via R-help
Hi Duncan,

On Wed, Jan 23, 2019 at 10:02:00AM -0500, Duncan Murdoch wrote:
> On 23/01/2019 5:27 a.m., Jan T Kim wrote:
> >Hi Ivan & All,
> >
> >R's scoping system basically goes to all environments along the call
> >stack when trying to resolve an unbound variable, see the language
> >definition [1], section 4.3.4, and perhaps also 2.1.5.
> 
> You are misinterpreting that section.  It's not the call stack that is
> searched, it's the chain of environments that starts with the evaluation
> frame of the current function.  Those are very different.

yes -- I meant the environment chain but mistakenly wrote "call stack",
sorry. Thanks for pointing this out.

Best regards, Jan


> For example,
> 
> 
> g <- function() {
>   print(secret)
> }
> 
> f <- function() {
>   secret <- "secret"
>   g()
> }
> 
> would fail, because even though secret is defined in the caller of g() and
> is therefore in the call stack, that's irrelevant:  it's not in g's
> evaluation frame (which has no variables) or its parent (which is the global
> environment if those definitions were evaluated there).
> 
> Duncan Murdoch
> 
> >
> >Generally, unbound variables should be used with care. It's a bit
> >difficult to decide whether and how the code should be rewritten,
> >I'd say that depends on the underlying intentions / purposes. As it
> >is, the code could be simplified to just
> >
> > print("secret");
> >
> >but that's probably missing the point.
> >
> >Best regards, Jan
> >
> >
> >[1] https://cran.r-project.org/doc/manuals/r-release/R-lang.html
> >
> >On Wed, Jan 23, 2019 at 12:53:01PM +0300, Ivan Krylov wrote:
> >>Hi!
> >>
> >>I needed to generalize a loss function being optimized inside another
> >>function, so I made it a function argument with a default value. It
> >>worked without problems, but later I noticed that the inner function,
> >>despite being defined in the function arguments, somehow closes over a
> >>variable belonging to the outer function, which is defined later.
> >>
> >>Example:
> >>
> >>outside <- function(inside = function() print(secret)) {
> >>secret <- 'secret'
> >>inside()
> >>}
> >>outside()
> >>
> >>I'm used to languages that have both lambdas and variable declaration
> >>(like perl5 -Mstrict or C++11), so I was a bit surprised.
> >>
> >>Does this work because R looks up the variable by name late enough at
> >>runtime for the `secret` variable to exist in the parent environment of
> >>the `inside` function? Can I rely on it? Is this considered bad style?
> >>Should I rewrite it (and how)?
> >>
> >>-- 
> >>Best regards,
> >>Ivan
> >>
> >>__
> >>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> >
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Function in default parameter value closing over variables defined later in the enclosing function

2019-01-23 Thread Jan T Kim
Hi Ivan & All,

R's scoping system basically goes to all environments along the call
stack when trying to resolve an unbound variable, see the language
definition [1], section 4.3.4, and perhaps also 2.1.5.

Generally, unbound variables should be used with care. It's a bit
difficult to decide whether and how the code should be rewritten,
I'd say that depends on the underlying intentions / purposes. As it
is, the code could be simplified to just

print("secret");

but that's probably missing the point.

Best regards, Jan


[1] https://cran.r-project.org/doc/manuals/r-release/R-lang.html

On Wed, Jan 23, 2019 at 12:53:01PM +0300, Ivan Krylov wrote:
> Hi!
> 
> I needed to generalize a loss function being optimized inside another
> function, so I made it a function argument with a default value. It
> worked without problems, but later I noticed that the inner function,
> despite being defined in the function arguments, somehow closes over a
> variable belonging to the outer function, which is defined later.
> 
> Example:
> 
> outside <- function(inside = function() print(secret)) {
>   secret <- 'secret'
>   inside()
> }
> outside()
> 
> I'm used to languages that have both lambdas and variable declaration
> (like perl5 -Mstrict or C++11), so I was a bit surprised.
> 
> Does this work because R looks up the variable by name late enough at
> runtime for the `secret` variable to exist in the parent environment of
> the `inside` function? Can I rely on it? Is this considered bad style? 
> Should I rewrite it (and how)?
> 
> -- 
> Best regards,
> Ivan
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random projection

2018-12-05 Thread Jan Galkowski
Ms Fleming,

This blog post (of mine) may be of interest and hopefully of help:

https://667-per-cm.net/2018/11/20/the-johnson-lindenstrauss-lemma-and-the-paradoxical-power-of-random-linear-operators-part-1/
Cheers!

--
Jan Galkowski (o°)
Westwood, MA

(pronouns: *he, him, his*)

*Keep your energy local*.  --John Farrell, *ILSR[1]*



Links:

  1. http://ilsr.org

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] saveRDS() and readRDS() Why?

2018-11-07 Thread Jan van der Laan



Are you sure you didn't do saveRDS("rawData", file = "rawData.rds") 
instead of saveRDS(rawData, file = "rawData.rds") ? This would explain 
the result you have under linux.
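
A small sketch (added for illustration, not from the original message) of the
difference between the two calls:

rawData <- data.frame(x = 1:3)

saveRDS(rawData, file = "rawData.rds")    # serialises the object itself
readRDS("rawData.rds")                    # gives back the data.frame

saveRDS("rawData", file = "rawData.rds")  # serialises the character string
readRDS("rawData.rds")                    # gives back [1] "rawData"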


In principle saveRDS and readRDS can be used to copy objects between 
R-sessions without losing information.


What does readRDS return on windows with the same file?

What type of object is rawData? Do str(rawData). Some objects created by 
packages cannot be serialized, e.g. objects that point to memory 
allocated by a package. The pointer is then serialized not the memory 
pointed to.


Also, if the object is generated by a package, you might need to load 
the package to get the printing etc. of the object right.


HTH,

Jan







On 07-11-18 09:45, Patrick Connolly wrote:

On Wed, 07-Nov-2018 at 08:27AM +, Robert David Burbidge wrote:

|> Hi Patrick,
|>
|> From the help: "save writes a single line header (typically
|> "RDXs\n") before the serialization of a single object".
|>
|> If the file sizes are the same (see Eric's message), then the
|> problem may be due to different line terminators. Try serialize and
|> unserialize for low-level control of saving/reading objects.

I'll have to find out what 'serialize' means.

On Windows, it's a huge table, looks like it's all hexadecimal.

On Linux, it's just the text string 'rawData' -- a lot more than line
terminators.

Have I misunderstood what the idea is?  I thought I'd get an identical
object, irrespective of how different the OS stores and zips it.



|>
|> Rgds,
|>
|> Robert
|>
|>
|> On 07/11/18 08:13, Eric Berger wrote:
|> >What do you see at the OS level?
|> >i.e. on windows
|> >DIR rawData.rds
|> >on linux
|> >ls -l rawData.rds
|> >compare the file sizes on both.
|> >
|> >
|> >On Wed, Nov 7, 2018 at 9:56 AM Patrick Connolly 
|> >wrote:
|> >
|> >> From a Windows R session, I do
|> >>
|> >>>object.size(rawData)
|> >>31736 bytes  # from scraping a non-reproducible web address.
|> >>>saveRDS(rawData, file = "rawData.rds")
|> >>Then copy to a Linux session
|> >>
|> >>>rawData <- readRDS(file = "rawData.rds")
|> >>>rawData
|> >>[1] "rawData"
|> >>>object.size(rawData)
|> >>112 bytes
|> >>>rawData
|> >>[1] "rawData" # only the name and something to make up 112 bytes
|> >>Have I misunderstood the syntax?
|> >>
|> >>It's an old version on Windows.  I haven't used Windows R since then.
|> >>
|> >>major  3
|> >>minor  2.4
|> >>year   2016
|> >>month  03
|> >>day16
|> >>
|> >>
|> >>I've tried R-3.5.0 and R-3.5.1 Linux versions.
|> >>
|> >>In case it's material ...
|> >>
|> >>I couldn't get the scraping to work on either of the R installations
|> >>but Windows users told me it worked for them.  So I thought I'd get
|> >>the R object and use it.  I could understand accessing the web address
|> >>could have different permissions for different OSes, but should that
|> >>affect the R objects?
|> >>
|> >>TIA
|> >>
|> >>--
|> >>~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
|> >>___Patrick Connolly
|> >>  {~._.~}   Great minds discuss ideas
|> >>  _( Y )_ Average minds discuss events
|> >>(:_~*~_:)  Small minds discuss people
|> >>  (_)-(_)  . Eleanor Roosevelt
|> >>
|> >>~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
|> >>
|> >>__
|> >>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
|> >>https://stat.ethz.ch/mailman/listinfo/r-help
|> >>PLEASE do read the posting guide
|> >>http://www.R-project.org/posting-guide.html
|> >>and provide commented, minimal, self-contained, reproducible code.
|> >>
|> >  [[alternative HTML version deleted]]
|> >
|> >__
|> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
|> >https://stat.ethz.ch/mailman/listinfo/r-help
|> >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> >and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot a path

2018-11-01 Thread Jan van der Laan
Below a similar example, using sf and leaflet; plotting the trajectory 
on a background map.



library(leaflet)
library(sf)
library(dplyr)

# Generate example data
gen_data <- function(id, n) {
  data.frame(
id = id,
date = 1:n,
lat = runif(10, min = -90, max = 90),
lon = runif(10, min = -180, max = 180)
  )
}

dta <- lapply(1:2, gen_data, n = 10) %>% bind_rows()

# Transform all records of one object/person to a st_linestring, then
# combine into one sf column
lines <- dta %>%
  arrange(id, date) %>%
  split(dta$id) %>%
  lapply(function(d) st_linestring(cbind(d$lon, d$lat))) %>%
  unname() %>%   # Without the unname it doesn't work for some reason
  st_sfc()

# Plot using leaflet
leaflet() %>%
  addTiles() %>%
  addPolylines(data = lines)


HTH - Jan


On 01-11-18 11:27, Rui Barradas wrote:

Hello,

The following uses ggplot2.

First, make up a dataset, since you have not posted one.



lat0 <- 38.736946
lon0 <- -9.142685
n <- 10

set.seed(1)
Date <- seq(Sys.Date() - n + 1, Sys.Date(), by = "days")
Lat <- lat0 + cumsum(c(0, runif(n - 1)))
Lon <- lon0 + cumsum(c(0, runif(n - 1)))
Placename <- rep(c("A", "B"), n/2)

path <- data.frame(Date, Placename, Lat, Lon)
path <- path[order(path$Date), ]


Now, two graphs, one with just one line of all the lon/lat and the other 
with a line for each Placename.


library(ggplot2)

ggplot(path, aes(x = Lon, y = Lat)) +
   geom_point() +
   geom_line()


ggplot(path, aes(x = Lon, y = Lat, colour = Placename)) +
   geom_point(aes(fill = Placename)) +
   geom_line()


Hope this helps,

Rui Barradas

Às 21:27 de 31/10/2018, Ferri Leberl escreveu:


Dear All,
I have a dataframe with four cols: Date, Placename, geogr. latitude, 
geogr. longitude.
How can I plot the path as a line, ordered by the date, with the 
longitude as the x-axis and the latitude as the y-axis?

Thank you in advance!
Yours, Ferri

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating just a single row of dissimilarity/distance matrix

2018-10-27 Thread Jan van der Laan

Please respond to the list; there are more people answering there.


As explained in the documentation gower_dist performes a pairwise 
comparison of the two arguments recycling the shortest one if needed, so 
indeed gower_dist(iris[1:5, ], iris) doesn't do what you want.


Possible solutions are:

tmp <- split(iris[1:150, ], seq_len(150))

sapply(tmp, gower_dist, iris)


and:


library(dplyr)

library(tidyr)

pairs <- expand.grid(x = 1:5, y = 1:nrow(iris))
pairs$dist <- gower_dist(iris[pairs$x, ], iris[pairs$y, ])
pairs %>% spread(y, dist)

Don't know which one is faster. And there are probably various other 
solutions too.


--
Jan





On 27-10-18 18:04, Aerenbkts bkts wrote:

Dear Jan

Thanks for your help. Actually it works for the first element. But I 
tried to calculate distance values for the first N rows. For example;


gower_dist(iris[1:5,], iris) // gower distance for the first 5 rows. 
but it did not work. Do you have any suggestion about it?




On Fri, 26 Oct 2018 at 21:31, Jan van der Laan <mailto:rh...@eoos.dds.nl>> wrote:



Using another implementation of the gower distance:


library(gower)

gower_dist(iris[1,], iris)


HTH,

Jan



On 26-10-18 15:07, Aerenbkts bkts wrote:
> I have a data-frame with 30k rows and 10 features. I would like to
> calculate distance matrix like below;
>
> gower_dist <- daisy(data-frame, metric = "gower"),
>
>
> This function returns whole dissimilarity matrix. I want to get just
> the first row.
> (Just distances of the first element in data-frame). How can I
do it?
> Do you have an idea?
>
>
> Regards
>
>       [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org <mailto:R-help@r-project.org> mailing list
-- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org <mailto:R-help@r-project.org> mailing list --
To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating just a single row of dissimilarity/distance matrix

2018-10-26 Thread Jan van der Laan



Using another implementation of the gower distance:


library(gower)

gower_dist(iris[1,], iris)


HTH,

Jan



On 26-10-18 15:07, Aerenbkts bkts wrote:

I have a data-frame with 30k rows and 10 features. I would like to
calculate distance matrix like below;

gower_dist <- daisy(data-frame, metric = "gower"),


This function returns whole dissimilarity matrix. I want to get just
the first row.
(Just distances of the first element in data-frame). How can I do it?
Do you have an idea?


Regards

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug : Autocorrelation in sample drawn from stats::rnorm (hmh)

2018-10-05 Thread Annaert Jan
On 05/10/2018, 09:45, "R-help on behalf of hmh"  wrote:

Hi,

Thanks William for this fast answer, and sorry for sending the 1st mail 
to r-help instead of to r-devel.


I noticed that bug while I was simulating many small random walks using 
c(0,cumsum(rnorm(10))). Then the negative auto-correlation was inducing 
a much smaller space visited by the random walks than expected if there 
were no auto-correlation in the samples.


The code I provided and you optimized was only meant to illustrate 
and investigate that bug.


It is really worrying that most of the R distributions are affected by 
this bug.

What I did should have been one of the first checks done for _*each*_ 
distribution by the developers of these functions!


And if as you suggested this is a "tolerated" _error_ of the algorithm, 
I do think this is a bad choice, but anyway, this should have been 
mentioned in the documentation of the functions!!


cheers,

hugo
 
This is not a bug. You have simply rediscovered the finite-sample bias in the 
sample autocorrelation coefficient, known at least since
Kendall, M. G. (1954). Note on bias in the estimation of autocorrelation. 
Biometrika, 41(3-4), 403-404. 

The bias is approximately -1/T, with T the sample size, which explains why it seems 
to disappear in the larger sample sizes you consider.
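
A quick simulation (a sketch added here, in the spirit of the code in the
original post) that makes the -1/T approximation visible:

set.seed(1)
n <- 10
r1 <- replicate(1e4, {
  x <- rnorm(n)
  cor(x[-1], x[-n])
})
mean(r1)   # roughly -0.1, i.e. about -1/T for T = 10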

Jan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug : Autocorrelation in sample drawn from stats::rnorm

2018-10-04 Thread Annaert Jan

Did you take into account that the sample serial correlation coefficient has a 
bias of approximately -1/T (with T the sample size)? Its variance is 
approximately 1/T.
Jan Annaert



-Original Message-
From: R-help  On Behalf Of hmh
Sent: donderdag 4 oktober 2018 12:09
To: R 
Subject: [R] Bug : Autocorrelation in sample drawn from stats::rnorm

Hi,


I just noticed the following bug:

When we draw a random sample using the function stats::rnorm, there should be 
no auto-correlation in the sample. But there is some auto-correlation _when 
the sample that is drawn is small_.

I describe the problem using two functions:

DistributionAutocorrelation_Unexpected, which has the wrong behavior: 
_when drawing some small samples using rnorm, there is generally a strong 
negative auto-correlation in the sample_.

and

DistributionAutocorrelation_Expected which illustrate the expected behavior



Unexpected:

DistributionAutocorrelation_Unexpected = function(SampleSize){
   Cor = NULL
   for(repetition in 1:1e5){
     X = rnorm(SampleSize)
     Cor[repetition] = cor(X[-1],X[-length(X)])
   }
   return(Cor)
}

par(mfrow=c(3,3))
for(SampleSize_ in c(4,5,6,7,8,10,15,20,50)){
hist(DistributionAutocorrelation_Unexpected(SampleSize_),col='grey',main=paste0('SampleSize=',SampleSize_))
; abline(v=0,col=2)
}

output:


Expected:

DistributionAutocorrelation_Expected = function(SampleSize){
   Cor = NULL
   for(repetition in 1:1e5){
     X = rnorm(SampleSize)
     Cor[repetition] = cor(sample(X[-1]),sample(X[-length(X)]))
   }
   return(Cor)
}

par(mfrow=c(3,3))
for(SampleSize_ in c(4,5,6,7,8,10,15,20,50)){
hist(DistributionAutocorrelation_Expected(SampleSize_),col='grey',main=paste0('SampleSize=',SampleSize_))
; abline(v=0,col=2)
}




Some more information you might need:


packageDescription("stats")
Package: stats
Version: 3.5.1
Priority: base
Title: The R Stats Package
Author: R Core Team and contributors worldwide
Maintainer: R Core Team 
Description: R statistical functions.
License: Part of R 3.5.1
Imports: utils, grDevices, graphics
Suggests: MASS, Matrix, SuppDists, methods, stats4
NeedsCompilation: yes
Built: R 3.5.1; x86_64-pc-linux-gnu; 2018-07-03 02:12:37 UTC; unix

Thanks for correcting that.

Feel free to ask for any further information you would need.

cheers,

hugo


-- 

Hugo Mathé-Hubert

ATER

Laboratoire Interdisciplinaire des Environnements Continentaux (LIEC)

UMR 7360 CNRS -  Bât IBISE

Université de Lorraine  -  UFR SciFA

8, Rue du Général Delestraint

F-57070 METZ

+33(0)9 77 21 66 66
- - - - - - - - - - - - - - - - - -
Les réflexions naissent dans les doutes et meurent dans les certitudes. 
Les doutes sont donc un signe de force et les certitudes un signe de 
faiblesse. La plupart des gens sont pourtant certains du contraire.
- - - - - - - - - - - - - - - - - -
Thoughts appear from doubts and die in convictions. Therefore, doubts 
are an indication of strength and convictions an indication of weakness. 
Yet, most people believe the opposite.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Erase content of dataframe in a single stroke

2018-09-27 Thread Jan van der Laan

Or

testdf <- testdf[FALSE, ]

or

testdf <- testdf[numeric(0), ]

which seems to be slightly faster.

Best,
Jan


Op 27-9-2018 om 10:32 schreef PIKAL Petr:

Hm

I would use


testdf<-data.frame(A=c(1,2),B=c(2,3),C=c(3,4))
str(testdf)

'data.frame':   2 obs. of  3 variables:
  $ A: num  1 2
  $ B: num  2 3
  $ C: num  3 4

testdf<-testdf[-(1:nrow(testdf)),]
str(testdf)

'data.frame':   0 obs. of  3 variables:
  $ A: num
  $ B: num
  $ C: num

Cheers
Petr


-Original Message-
From: R-help  On Behalf Of Jim Lemon
Sent: Thursday, September 27, 2018 10:12 AM
To: Luigi Marongiu ; r-help mailing list 
Subject: Re: [R] Erase content of dataframe in a single stroke

Ah, yes, try "as.data.frame" on it.

Jim

On Thu, Sep 27, 2018 at 6:00 PM Luigi Marongiu 
wrote:

Thank you Jim,
this requires the definition of an ad hoc function; strange that R
does not have a function for this purpose...
Anyway, it works but it changes the structure of the data. By
redefining the dataframe as I did, I obtain:


df

[1] A B C
<0 rows> (or 0-length row.names)

str(df)

'data.frame': 0 obs. of  3 variables:
  $ A: num
  $ B: num
  $ C: num

When applying your function, I get:


df

$A
NULL

$B
NULL

$C
NULL


str(df)

List of 3
  $ A: NULL
  $ B: NULL
  $ C: NULL

The dataframe has become a list. Would that affect downstream

applications?

Thank you,
Luigi
On Thu, Sep 27, 2018 at 9:45 AM Jim Lemon 

wrote:

Hi Luigi,
Maybe this:

testdf<-data.frame(A=1,B=2,C=3)

testdf

  A B C
1 1 2 3
toNull<-function(x) return(NULL)
testdf<-sapply(testdf,toNull)

Jim
On Thu, Sep 27, 2018 at 5:29 PM Luigi Marongiu

 wrote:

Dear all,
I would like to erase the content of a dataframe -- but not the
dataframe itself -- in a simple and fast way.
At the moment I do that by re-defining the dataframe itself in this way:


df <- data.frame(A = numeric(),

+   B = numeric(),
+   C = character())

# assign
A <- 5
B <- 0.6
C <- 103
# load
R <- cbind(A, B, C)
df <- rbind(df, R)
df

   A   B   C
1 5 0.6 103

# erase
df <- data.frame(A = numeric(),

+  B = numeric(),
+  C = character())

df

[1] A B C
<0 rows> (or 0-length row.names)
Is there a way to erase the content of the dataframe in a simplier
(acting on all the dataframe at once instead of naming each column
individually) and nicer (with a specific erasure command instead
of re-defyining the object itself) way?

Thank you.
--
Best regards,
Luigi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Best regards,
Luigi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reading data problem

2018-09-24 Thread Jan T Kim via R-help
hmm... I don't see the quote="" parameter in your read.csv call


Best regards, Jan
--
Sent from my mobile. Apologies for typos and terseness

On Mon, Sep 24, 2018, 20:40 greg holly  wrote:

> Hi Jan;
>
> Thanks so much for this. Yes, I did. Her is my code to read
> data: a<-read.csv("for_R_graphs.csv", header=T, sep=",")
>
> On Mon, Sep 24, 2018 at 2:07 PM Jan T Kim via R-help 
> wrote:
>
>> Yet one more: have you tried adding quote="" to your read.table
>> parameters? Quote characters have a 50% chance of being balanced,
>> and they can encompass multiple lines...
>>
>> On Mon, Sep 24, 2018 at 11:40:47AM -0700, Bert Gunter wrote:
>> > One more question:
>> >
>> > 5. Have you tried shutting down, restarting R, and rereading?
>> >
>> > -- Bert
>> >
>> > On Mon, Sep 24, 2018 at 11:36 AM Bert Gunter 
>> wrote:
>> >
>> > > *Perhaps* useful questions (perhaps *not*, though):
>> > >
>> > > 1. What is your OS? What is your R version?
>> > > 2. How do you know that your data has 151 rows?
>> > > 3. Are there stray characters -- perhaps a stray eof -- in your data?
>> Have
>> > > you checked around row 96 to see what's there?
>> > > 4. Are the data you did get in R what you expect?
>> > >
>> > > -- Bert
>> > >
>> > > Bert Gunter
>> > >
>> > > "The trouble with having an open mind is that people keep coming
>> along and
>> > > sticking things into it."
>> > > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> > >
>> > >
>> > > On Mon, Sep 24, 2018 at 11:27 AM greg holly 
>> wrote:
>> > >
>> > >> Hi Dear all;
>> > >>
>> > >> I have a dataset with 151*291 dimension. After making data read into
>> R I
>> > >> am
>> > >> getting a data with 96*291 dimension. Even though  I have no error
>> message
>> > >> from R I could not understand the reason why I cannot get data
>> correctly?
>> > >>
>> > >> Here are my codes to make read the data
>> > >> a<-read.table("for_R_graphs.csv", header=T, sep=",")
>> > >> a<-read.table("for_R_graphs.txt", header=T, sep="\t")
>> > >>
>> > >> Regards,
>> > >>
>> > >> Greg
>> > >>
>> > >> [[alternative HTML version deleted]]
>> > >>
>> > >> __
>> > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > >> https://stat.ethz.ch/mailman/listinfo/r-help
>> > >> PLEASE do read the posting guide
>> > >> http://www.R-project.org/posting-guide.html
>> > >> and provide commented, minimal, self-contained, reproducible code.
>> > >>
>> > >
>> >
>> >   [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reading data problem

2018-09-24 Thread Jan T Kim via R-help
Yet one more: have you tried adding quote="" to your read.table
parameters? Quote characters have a 50% chance of being balanced,
and they can encompass multiple lines...
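
A sketch of the suggestion (added for illustration; the file name is the one
from the original post):

a <- read.csv("for_R_graphs.csv", header = TRUE, quote = "")
nrow(a)   # if stray quote characters were the cause, this should now be 151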

On Mon, Sep 24, 2018 at 11:40:47AM -0700, Bert Gunter wrote:
> One more question:
> 
> 5. Have you tried shutting down, restarting R, and rereading?
> 
> -- Bert
> 
> On Mon, Sep 24, 2018 at 11:36 AM Bert Gunter  wrote:
> 
> > *Perhaps* useful questions (perhaps *not*, though):
> >
> > 1. What is your OS? What is your R version?
> > 2. How do you know that your data has 151 rows?
> > 3. Are there stray characters -- perhaps a stray eof -- in your data? Have
> > you checked around row 96 to see what's there?
> > 4. Are the data you did get in R what you expect?
> >
> > -- Bert
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Mon, Sep 24, 2018 at 11:27 AM greg holly  wrote:
> >
> >> Hi Dear all;
> >>
> >> I have a dataset with 151*291 dimension. After making data read into R I
> >> am
> >> getting a data with 96*291 dimension. Even though  I have no error message
> >> from R I could not understand the reason why I cannot get data correctly?
> >>
> >> Here are my codes to make read the data
> >> a<-read.table("for_R_graphs.csv", header=T, sep=",")
> >> a<-read.table("for_R_graphs.txt", header=T, sep="\t")
> >>
> >> Regards,
> >>
> >> Greg
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R shared library (/usr/lib64/R/lib/libR.so) not found.

2018-08-23 Thread Jan T Kim via R-help
Hi Rolf & All,

I haven't built R in a while, but my general expectation of an
autotools based build & install would be that the default prefix
is /usr/local, rather than /usr. So I'd expect the shared libs
in /usr/local/lib, /usr/local/lib64 etc.

I also have a recollection that I once installed Rstudio for some
MOOC, and ended up putting a symlink in somewhere like /usr/lib* ,
because Rstudio was only available as a binary with the location
of the shared lib hard-baked into it.

Depending on your setup this may be irrelevant, apologies
in that case.

Best regards, Jan


On Thu, Aug 23, 2018 at 10:57:35PM +1200, Rolf Turner wrote:
> 
> I *think* that this is an R question (and *not* an RStudio question!)
> 
> I have, somewhat against my better judgement, decided to experiment with
> using RStudio.
> 
> I downloaded and install RStudio.  Easy-peasy.  Nice lucid instructions.
> 
> Then I tried to start RStudio ("rstudio" from the command line)
> and got a pop-up window with the error message:
> 
> >R shared library (/usr/lib64/R/lib/libR.so) not found. If this
> >is a custom build of R, was it built with the --enable-R-shlib option?
> 
> Oops, no, I guess it wasn't.  So I carefully did a
> 
> sudo make uninstall
> make clean
> make distclean
> 
> and then did
> 
> ./R-3.5.1/configure 
> 
> making sure I added the --enable-R-shlib flag.
> 
> Then I did make and sudo make install. It all seemed to go ...
> but then I did
> 
> rstudio
> 
> again and got the same popup error.
> 
> There is indeed *no* libR.so in /usr/lib64/R/lib.
> 
> There *is* a libR.so in /usr/lib/R/lib, but (weirdly) ls -l reveals that it
> dates from the my previous install of R-3.5.1 for which I *did not*
> configure with --enable-R-shlib.
> 
> Can anyone explain to me WTF is going on?
> 
> What should I do?  Just make a symbolic link from /usr/lib/R/lib/libR.so to
> /usr/lib64/R/lib/libR.so?
> 
> It bothers me that /usr/lib/R/lib/libR.so was not "refreshed" from my
> most recent install of R.
> 
> I plead for enlightenment.
> 
> cheers,
> 
> Rolf Turner
> 
> P.S. I'm running Ubuntu 18.04.  And the previous install of R was done under
> Ubuntu 18.04.
> 
> R. T.
> 
> -- 
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] security using R at work

2018-08-09 Thread Jan van der Laan
You can also inadvertently transmit data to the internet using a package 
without being obviously 'stupid', e.g. by using a package that uses an 
external service for data processing. For example, some javascript 
visualisation libs can do that (not sure if those wrapped in R-packages 
do), or, for example, a geocoding service.


Not having an (outgoing) internet connection at least helps against 
mistakes like this (and probably against many untargeted attacks). If it 
is allowed to have the sensitive data on that computer, using R on that 
computer is probably not going to make is less safe.


Jan


On 09-08-18 09:19, Rainer M Krug wrote:

I can not agree more, Barry. Very nicely put.

Rainer



On 8 Aug 2018, at 18:10, Barry Rowlingson  wrote:

On Wed, Aug 8, 2018 at 4:09 PM, Laurence Clark
 wrote:

Hello all,

I want to download R and use it for work purposes. I hope to use it to analyse 
very sensitive data from our clients.

My question is:

If I install R on my work network computer, will the data ever leave our 
network? I need to know if the data goes anywhere other than our network, 
because this could compromise it's security.



Is there is any chance the data could go to a server owned by 'R' or anything 
else that's not immediately obvious, but constitutes the data leaving our 
network?


You are talking mostly to statisticians here, and if p>0 then there's
"a chance". I'd say yes, there's a chance, but its pretty small, and
would only occur through stupidity, accident or malice.

In the ordinary course of things your data will be on your hard disk,
or on your corporate network drives, and only exist between your
corporate network server and your PC's memory. R will load the data
into that memory, do stuff with it in that memory, and write results
back to hard disk. Nothing leaves the network this way.

However... R has facilities for talking to the internet. You can save
data to google docs spreadsheets, for example, but you'd have to be
signed in to google, and have to type something like:


writeGoogleDoc(my_data, "secretdata.xls")


that covers "stupid". You should know that google docs are on google's
servers, and google's servers aren't on your network, and your secret
data shouldn't go on google's servers.

Accidents happen. You might be working on non-secret data which you
want to save to google docs, and accidentally save "data1" which is
secret instead of "data2" which is okay to be public. Oops. You sent
it to google. Accidents happen.

"malice" would be if someone had put code into R or an add-on package
that you use that sends your data over the network without you
knowing. For example maybe every time you fit a linear model with:

lm(age~beauty, data=people)

R could be transmitting the data to hackers. But the chance of this is
very small, and I don't think any malicious code has ever been
discovered in R or the 12000 add-on packages downloadable from CRAN.
Doesn't mean it hasn't been discovered yet or won't be in the future.

It used to be said that the only machine safe from hackers was one
unplugged from the network. But now hackers can get to your machine
via malicious USB sticks, keyboard loggers, and various other nasties.
The only machine safe from hackers is one with the power off. But take
the power plug out because a wake-on-lan packet could switch your
machine on remotely

Barry








Thank you

Laurence


--
Laurence Clark
Business Data Analyst
Account Management
Health Management Ltd

Mobile: 07584 556498
Switchboard:0845 504 1000
Email:  laurence.cl...@healthmanltd.com
Web:www.healthmanagement.co.uk


Re: [R] F-test where the coefficients in the H_0 is nonzero

2018-08-03 Thread Annaert Jan
You can easily test linear restrictions using the function linearHypothesis() 
from the car package.
There are several ways to set up the null hypothesis, but a straightforward one 
here is:
 
> library(car)
> x <- rnorm(10)
> y <- x+rnorm(10)
> linearHypothesis(lm(y~x), c("(Intercept)=0", "x=1"))
Linear hypothesis test

Hypothesis:
(Intercept) = 0
x = 1

Model 1: restricted model
Model 2: y ~ x

  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     10 10.6218                           
2      8  9.0001  2    1.6217 0.7207 0.5155


Jan

From: R-help  on behalf of John 

Date: Thursday, 2 August 2018 at 10:44
To: r-help 
Subject: [R] F-test where the coefficients in the H_0 is nonzero

Hi,

   I try to run the regression
   y = beta_0 + beta_1 x
   and test H_0: (beta_0, beta_1) =(0,1) against H_1: H_0 is false
   I believe I can run the regression
   (y-x) = beta_0 + beta_1' x
   and do the regular F-test (using the lm function) where the hypothesized
coefficients are all zero.

   Is there any function in R that deals with the case where the
coefficients are nonzero?

John

[[alternative HTML version deleted]]

__
mailto:R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2018-07-10 Thread Werning, Jan-Philipp

Hi,

thanks a lot! Now it works.

Yours

Jan

Am 10.07.2018 um 09:00 schrieb PIKAL Petr 
mailto:petr.pi...@precheza.cz>>:

Hi

see in line

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Werning, Jan-
Philipp
Sent: Monday, July 9, 2018 9:42 PM
To: r-help@r-project.org<mailto:r-help@r-project.org>
Subject: [R] (no subject)

Dear all,


In the end I try to run a system dynamics simulation in R using the package
deSolve.
Therefore I need an auxiliary list (auxs) the model can refer to when the
functions need an auxiliary value.

I used a manual list:

auxs <- c( aSplitSN=0.4 , aSplitLN=0.6, aSplitSR1=0 , aSplitLR1=1, aSplitSR2=0 ,
aSplitLR2=1, aSplitSR3=0 , aSplitLR3=1, aSalesNR=0.92, aSalesRR=0.08, […])

This is vector not list.
auxs <- c( aSplitSN=0.4 , aSplitLN=0.6, aSplitSR1=0 , aSplitLR1=1, aSplitSR2=0)
is.vector(auxs)
[1] TRUE
is.list(auxs)
[1] FALSE


this way everything worked well.

Now I want to use a matrix with different values for each of the auxiliaries in
order to run different scenarios. Therefore I created a csv document which I read
in:

csv1  <- read.csv("180713_Taguchi Robust Design Test_180709_1745.csv", sep
= ";")

list_csv <- csv1[1,]

which is probably data frame

test<-vec[1,]
is.vector(test)
[1] FALSE
is.list(test)
[1] TRUE
is.data.frame(test)
[1] TRUE



namesauxs <- names(list_csv)

auxs1 <- as.numeric(list_csv)

names(auxs1) <- namesauxs

auxs <- auxs1


Looking at the global environment section in R studio, now both are the same,
in the value section as "Named num"

I do not know rstudio but you could check two objects by
?identical


Yet, the model will not run using these values ultimately coming from the csv.

I wonder why you use as.numeric in the first instance. You could use

auxs1 <- unlist(csv1[1,])
and you should get a named numeric vector. Maybe there are problems when reading 
numbers from the csv file. You could check it e.g. by

str(auxs1)


What am I doing wrong here?

It would be great if you could help.

Thanks a lot in advance

Yours

Jan





[[alternative HTML version deleted]]

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2018-07-09 Thread Werning, Jan-Philipp
Dear all,


In the end I try to run a system dynamics simulation in R using the package 
deSolve.
Therefore I need an auxiliary list (auxs) the model can refer to when the 
functions need an auxiliary value.

I used a manual list:

auxs <- c( aSplitSN=0.4 , aSplitLN=0.6, aSplitSR1=0 , aSplitLR1=1, aSplitSR2=0 
, aSplitLR2=1, aSplitSR3=0 , aSplitLR3=1, aSalesNR=0.92, aSalesRR=0.08, […])

this way everything worked well.

Now I want to use a matrix with different values for each of the auxiliaries in 
order to run different scenarios. Therefore I created a csv document which I 
read in:

csv1  <- read.csv("180713_Taguchi Robust Design Test_180709_1745.csv", sep = 
";")

list_csv <- csv1[1,]

namesauxs <- names(list_csv)

 auxs1 <- as.numeric(list_csv)

 names(auxs1) <- namesauxs

 auxs <- auxs1


Looking at the global environment section in R studio, now both are the same, 
in the value section as "Named num"

Yet, the model will not run using these values ultimately coming from the csv.

What am I doing wrong here?

It would be great if you could help.

Thanks a lot in advance

Yours

Jan





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] Natural Language Processing for non-English languages with udpipe

2018-01-16 Thread Jan Wijffels
Dear R users,

I'm happy to announce the release of version 0.3 of the udpipe R package on
CRAN (https://CRAN.R-project.org/package=udpipe). The udpipe R package is a
Natural Language Processing toolkit that provides language-agnostic
'tokenization', 'parts of speech tagging', 'lemmatization', 'morphological
feature tagging' and 'dependency parsing' of raw text. Next to text
parsing, the R package also allows you to train annotation models based on
data of 'treebanks' in 'CoNLL-U' format as provided at
http://universaldependencies.org/format.html.

The R package provides direct access to language models trained on more
than 50 languages. The following languages are directly available:

afrikaans, ancient_greek-proiel, ancient_greek, arabic, basque, belarusian,
bulgarian, catalan, chinese, coptic, croatian, czech-cac, czech-cltt,
czech, danish, dutch-lassysmall, dutch, english-lines, english-partut,
english, estonian, finnish-ftb, finnish, french-partut, french-sequoia,
french, galician-treegal, galician, german, gothic, greek, hebrew, hindi,
hungarian, indonesian, irish, italian, japanese, kazakh, korean,
latin-ittb, latin-proiel, latin, latvian, lithuanian, norwegian-bokmaal,
norwegian-nynorsk, old_church_slavonic, persian, polish, portuguese-br,
portuguese, romanian, russian-syntagrus, russian, sanskrit, serbian,
slovak, slovenian-sst, slovenian, spanish-ancora, spanish, swedish-lines,
swedish, tamil, turkish, ukrainian, urdu, uyghur, vietnamese

We hope that the package will allow other R users to build natural language
applications on top of the resulting parts of speech tags, tokens,
morphological features and dependency parsing output. And we hope in
particular that applications will arise which are not limited to English
only (like the textrank R package or the cleanNLP package to name a few)

Note that the package has no external software dependencies (no java nor
python) and depends only on 2 R packages (Rcpp and data.table), which makes
the package easy to install on any platform.
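
A minimal usage sketch (added for illustration, not part of the announcement;
the example sentence is made up):

library(udpipe)
dl   <- udpipe_download_model(language = "english")
ud   <- udpipe_load_model(file = dl$file_model)
anno <- udpipe_annotate(ud, x = "The package parses raw text into annotated tokens.")
head(as.data.frame(anno))   # tokens, lemmas, POS tags and dependency relations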

The package is available on CRAN at
https://CRAN.R-project.org/package=udpipe and is developed at
https://github.com/bnosac/udpipe
A small docusaurus website is made available at
https://bnosac.github.io/udpipe/en

We hope you enjoy using it and we would like to thank Milan Straka for all
the efforts done on UDPipe as well as all persons involved in
http://universaldependencies.org

all the best,
Jan

Jan Wijffels
Statistician
www.bnosac.be  | +32 486 611708

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Numerical stability in chisq.test

2017-12-27 Thread Jan Motl
The chisq.test contains following code:
STATISTIC <- sum(sort((x - E)^2/E, decreasing = TRUE))

However, based on the book Accuracy and stability of numerical algorithms 
<http://ftp.demec.ufpr.br/CFD/bibliografia/Higham_2002_Accuracy%20and%20Stability%20of%20Numerical%20Algorithms.pdf>
 Table 4.1 on page 89, it is better to sort the data in increasing order than 
in decreasing order, when the data are non-negative.

A demonstrative example:
x = matrix(c(rep(1.1, 10000), 10^16), nrow = 10001, ncol = 1)  # We have a 
vector with 10000*1.1 and 1*10^16
c(sum(sort(x, decreasing = TRUE)), sum(sort(x, decreasing = FALSE)))
The result:
10000000000010996 10000000000011000
When we sort the data in the increasing order, we get the correct result. If we 
sort the data in the decreasing order, we get a result that is off by 4.

Shouldn't the sort be in the increasing order rather than in the decreasing 
order?

Best regards,
 Jan Motl


PS: This post is based on discussion on 
https://stackoverflow.com/questions/47847295/why-does-chisq-test-sort-data-in-descending-order-before-summation
 
<https://stackoverflow.com/questions/47847295/why-does-chisq-test-sort-data-in-descending-order-before-summation>.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] release of version 0.2 of the textrank package

2017-12-21 Thread Jan Wijffels
Hello R users,

I'm pleased to announce the release of version 0.2 of the textrank package
on CRAN: https://CRAN.R-project.org/package=textrank

*The package is a natural language processing package which allows one to
summarize text by finding*
*- relevant sentences*
*- relevant keywords*

This is done by constructing a sentence network which finds how sentences
are related to one another (word overlap). On that network Google Pagerank
is used in order to find relevant sentences.

In a similar way 'textrank' can also be used to extract keywords. How? A
word network is constructed by looking if words are following one another.
On top of that network the 'Pagerank' algorithm is applied to extract
relevant words. Relevant words which are following one another are next
pasted together to get keywords.
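
A small sketch (added for illustration, not from the announcement), assuming
'anno' is a data.frame of annotated tokens with 'lemma' and 'upos' columns,
for example as produced by the udpipe package:

library(textrank)
kw <- textrank_keywords(anno$lemma,
                        relevant = anno$upos %in% c("NOUN", "ADJ"),
                        ngram_max = 3)
head(kw$keywords)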

The package has a vignette at
https://cran.r-project.org/web/packages/textrank/vignettes/textrank.html
and it also plays nicely with the udpipe package ​
https://CRAN.R-project.org/package=udpipe which is good for parts-of-speech
tagging, lemmatisation, dependency parsing and general NLP processing.

​all the best,
Jan


Jan Wijffels
Statistician
www.bnosac.be

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] DeSolve Package and Moving Average

2017-11-29 Thread Werning, Jan-Philipp
Dear all,


I am using the DeSolve Package to simulate a system dynamics model. At the 
problematic point in the model, I basically want to decide how many products 
shall be produced to be sold. In order to determine the amount a basic 
forecasting model of using the average of the last 12 time periods shall be 
used. My code looks like the following.

“ […]

# Time units in month
START<-0; FINISH<-120; STEP<-1

# Set seed for reproducability

 set.seed(123)

# Create time vector
simtime  <- seq(START, FINISH, by=STEP)

# Create a stock vector with initial values
stocks   <- c([…])

# Create an aux vector for the fixed aux values
auxs<- c([…])


model <- function(time, stocks, auxs){
  with(as.list(c(stocks, auxs)),{

[… “lots of aux, flow, and stock functions” … ]


aMovingAverage  <-  
ifelse(exists("ResultsSimulation")=="FALSE",1,movavg(ResultsSimulation$TotalSales,
 12, type = "s”))


return (list(c([…]))

  })
}

# Call Solver, and store results in a data frame
ResultsSimulation <-  data.frame(ode(y=stocks, times=simtime, func = model,
  parms=auxs, method="euler"))

[…]”

My problem is that the moving average (function: movavg) is only computed once 
and the same value is used in every timestep of the model. That is, when running 
the model for the first time, 1 is used; when running it the next time, the 
total sales value of the first timestep is used. Since only one timestep 
exists at that point, this is logical. Yet I would expect the movavg function to produce a 
new value in each of the 120 timesteps, as is the case with all other flow, 
stock and aux calculations.

It would be great if you could help me with fixing this problem.


Many thanks in advance!

Yours,

Jan





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R and LINGO?

2017-11-10 Thread Jan Olsen Røyland
Hi,
I'm struggling with this problem:
b) Another company wants to compose the optimal project portfolio based on the 
following 5-
year project proposals. In the table, the cash flow for each project in each 
year is shown.
                        Project 1  Project 2  Project 3  Project 4  Project 5  Project 6
1st year of the project       -58        -32        -18        -31        -33        -39
2nd year of the project        17         17         11          4         21         30
3rd year of the project        26         30         13         19         20          9
4th year of the project        18          7          4          7         22         13
5th year of the project        40          6          7         17          6         13
In this case, the company can also choose which year each project should 
commence. These six
candidate projects can begin either in 2018, in 2019 or in 2020, or not at all.
The current proposal is to undertake project 1, 2, 3 and 5, with project 3 and 
5 starting in 2018,
project 2 in 2019 and project 1 in 2020. Available funds by the end of year 
2017 will be 70 mill.
The resulting cash flow is given in the following table:
      Project 1  Project 2  Project 3  Project 5  Total cash flow  Available
                                                    from projects      funds
2017                                                                      70
2018                              -18        -33              -51         19
2019                  -32          11         21                0         19
2020        -58        17          13         20               -8         11
2021         17        30           4         22               73         84
2022         26         7           7          6               46        130
2023         18         6                                      24        154
2024         40                                                40        194
Formulate an optimization model in LINGO to determine which projects to 
undertake, and in which
years. The goal is to maximize available funds by the end of year 2024, while 
making sure that
available funds are always non-negative throughout the planning horizon. How 
much can the
improve compared to the current proposal? (For simplicity, assume zero discount 
rate.)

Best regards,
Jan Olsen Røyland


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Linear regression with transformed dependent variable

2017-10-23 Thread kende jan via R-help
Dear all, I am trying to fit a multiple linear regression model with a 
transformed dependent variable (the normality assumption was not verified...). 
I have applied a sqrt(variable) transformation... The results are great, but I 
don't know how to interpret the beta coefficients... Is it possible to do 
another transformation to get interpretable beta coefficients to express the 
variations in the original untransformed dependent variable? Thank you very 
much for your help! Noémie 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help understanding why glm and lrm.fit runs with my data, but lrm does not

2017-09-14 Thread Jan van der Laan


With lrm.fit you are fitting a completely different model. One of the 
things lrm does is prepare the input for lrm.fit, which in this case 
means that dummy variables are generated for categorical variables such 
as 'KILLIP'.
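
(A sketch added for illustration, not from the original reply: the same
expansion can be reproduced by hand with model.matrix before calling lrm.fit.)

X <- model.matrix(~ AGE + HYP + KILLIP + HRT + ANT, data = gusto2)[, -1]
fit2 <- lrm.fit(X, gusto2$DAY30)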


The error message means that model did not converge after the maximum 
number of iterations. One possible solution is to try to increase the 
maximum number of iterations, e.g.:


fit1 <- lrm(DAY30~AGE+HYP+KILLIP+HRT+ANT, data = gusto2, maxit = 100)

HTH,

Jan



On 14-09-17 09:30, Bonnett, Laura wrote:

Dear all,

I am using the publically available GustoW dataset.  The exact version I am 
using is available here: 
https://drive.google.com/open?id=0B4oZ2TQA0PAoUm85UzBFNjZ0Ulk

I would like to produce a nomogram for 5 covariates - AGE, HYP, KILLIP, HRT and ANT.  I 
have successfully fitted a logistic regression model using the "glm" function 
as shown below.

library(rms)
gusto <- spss.get("GustoW.sav")
fit <- 
glm(DAY30~AGE+HYP+factor(KILLIP)+HRT+ANT,family=binomial(link="logit"),data=gusto,x=TRUE,y=TRUE)

However, my review of the literature and other websites suggest I need to use "lrm" for 
the purposes of producing a nomogram.  When I run the command using "lrm" (see below) I 
get an error message saying:
Error in lrm(DAY30 ~ AGE + HYP + KILLIP + HRT + ANT, gusto2) :
   Unable to fit model using "lrm.fit"

My code is as follows:
gusto2 <- gusto[,c(1,3,5,8,9,10)]
gusto2$HYP <- factor(gusto2$HYP, labels=c("No","Yes"))
gusto2$KILLIP <- factor(gusto2$KILLIP, labels=c("1","2","3","4"))
gusto2$HRT <- factor(gusto2$HRT, labels=c("No","Yes"))
gusto2$ANT <- factor(gusto2$ANT, labels=c("No","Yes"))
var.labels=c(DAY30="30-day Mortality", AGE="Age in Years", KILLIP="Killip Class", 
HYP="Hypertension", HRT="Tachycardia", ANT="Anterior Infarct Location")
label(gusto2)=lapply(names(var.labels),function(x) 
label(gusto2[,x])=var.labels[x])

ddist = datadist(gusto2)
options(datadist='ddist')

fit1 <- lrm(DAY30~AGE+HYP+KILLIP+HRT+ANT,gusto2)

Error in lrm(DAY30 ~ AGE + HYP + KILLIP + HRT + ANT, gusto2) :
   Unable to fit model using "lrm.fit"

Online solutions to this problem involve checking whether any variables are 
redundant.  However, the results for my data suggest  that none are.
redun(~AGE+HYP+KILLIP+HRT+ANT,gusto2)

Redundancy Analysis

redun(formula = ~AGE + HYP + KILLIP + HRT + ANT, data = gusto2)

n: 2188 p: 5nk: 3

Number of NAs:   0

Transformation of target variables forced to be linear

R-squared cutoff: 0.9   Type: ordinary

R^2 with which each variable can be predicted from all other variables:

AGEHYP KILLIPHRTANT
  0.028  0.032  0.053  0.046  0.040

No redundant variables

I've also tried just considering "lrm.fit" and that code seems to run without 
error too:
lrm.fit(cbind(gusto2$AGE,gusto2$KILLIP,gusto2$HYP,gusto2$HRT,gusto2$ANT),gusto2$DAY30)

Logistic Regression Model

  lrm.fit(x = cbind(gusto2$AGE, gusto2$KILLIP, gusto2$HYP, gusto2$HRT,
  gusto2$ANT), y = gusto2$DAY30)

                      Model Likelihood     Discrimination    Rank Discrim.
                         Ratio Test            Indexes          Indexes
Obs          2188    LR chi2     233.59    R2       0.273    C       0.846
 0           2053    d.f.             5    g        1.642    Dxy     0.691
 1            135    Pr(> chi2) <0.0001    gr       5.165    gamma   0.696
max |deriv| 4e-09                          gp       0.079    tau-a   0.080
                                           Brier    0.048

            Coef     S.E.   Wald Z Pr(>|Z|)
  Intercept -13.8515 0.9694 -14.29 <0.0001
  x[1]        0.0989 0.0103   9.58 <0.0001
  x[2]        0.9030 0.1510   5.98 <0.0001
  x[3]        1.3576 0.2570   5.28 <0.0001
  x[4]        0.6884 0.2034   3.38 0.0007
  x[5]        0.6327 0.2003   3.16 0.0016

I was therefore hoping someone would explain why the "lrm" code is producing an error message, 
while "lrm.fit" and "glm" do not.  In particular I would welcome a solution to ensure I 
can produce a nomogram.

Kind regards,
Laura

Dr Laura Bonnett
NIHR Post-Doctoral Fellow

Department of Biostatistics,
Waterhouse Building, Block F,
1-5 Brownlow Street,
University of Liverpool,
Liverpool,
L69 3GL

0151 795 9686
l.j.bonn...@liverpool.ac.uk



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to extract values after using metabin from the package meta?

2017-05-04 Thread jan Pierre
Hello,

I’m trying to do a meta-analysis with R. I tried to use the function
metabin from the package meta :


set.seed(1)
centres<-c("SVP","NANTES","STRASBOURG","GRENOBLE","ANGERS","TOULON","MARSEILLE","COLMAR","BORDEAUX","RENNES","VALENCE","CAEN","NANCY")
# simulated counts: 'case' and 'witness' are the group sizes, the *_exposed
# columns are exposure counts within each group
data <- data.frame(case = rpois(13, 40), witness = rpois(13, 80))
data$case_exposed <- rbinom(13, data$case, 0.3)
data$witness_exposed <- rbinom(13, data$witness, 0.2)
data$case_nonexposed <- data$case - data$case_exposed
data$witness_nonexposed <- data$witness - data$witness_exposed
data$exposed <- data$case_exposed + data$witness_exposed
data$nonexposed <- data$case_nonexposed + data$witness_nonexposed
rownames(data) <- centres
metabin(data$case_exposed, data$case, data$witness_exposed, data$witness,
        studlab=centres, data=data, sm="OR")

where data is a data frame with the number of case_exposed, case,
witness_exposed and witness for each centre.

I obtain after using metabin :

How can I extract the values of OR and 95%-CI in the fixed effect model and
the random effects model? I want to put these data in another array.

I tried to use summary, but it doesn’t change anything.

Thanks for your help.
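
A hedged sketch of one way to pull those numbers out, reusing the simulated
objects above; the component names (TE.fixed, lower.fixed, ...) are the ones
the meta package used at the time and should be checked against names(m):

library(meta)
m <- metabin(data$case_exposed, data$case, data$witness_exposed, data$witness,
             studlab = centres, data = data, sm = "OR")
names(m)   # lists every stored component
or_fixed  <- exp(c(OR = m$TE.fixed,  lower = m$lower.fixed,  upper = m$upper.fixed))
or_random <- exp(c(OR = m$TE.random, lower = m$lower.random, upper = m$upper.random))
rbind(fixed = or_fixed, random = or_random)   # OR and 95%-CI as a small array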

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R 3.4.0 on Windows 7 Home Premium installed apparently fine, but packages failing to load ...

2017-04-25 Thread Jan Galkowski
Hello!

I welcome the new *R* 3.4.0.  I installed it on my Windows 7 Home
Premium [Service Pack 1, updated to latest, running on an HP AMD(Phenom)
II 955 X4 Processor, 3.20 GHz, 16 GB RAM, 64-bit, with lots of free
storage on disk and a solid state disk for virtual cache]. It was
installed atop the previous *R* version.
I tried the usual *update.packages(ask=FALSE)* and found many instances
of packages, e.g., *ctmm*, *SweaveListingUtils*, *plotly*,
*scatterplot3d*, *startupmsg*, which failed to install, apparently
because of an attempt to include an install of i386 instead of only x64.
I was using the Berkeley mirror via *https:*
> R version 3.4.0 (2017-04-21) -- "You Stupid Darkness" 
> Copyright (C) 2017 The R Foundation for Statistical Computing
> Platform: x86_64-w64-mingw32/x64 (64-bit)

I opted *not* to install from source those packages requiring
compilation, although Rtools is installed and FORTRAN compilations have
succeeded when installing packages before. I also have MSVC++ installed,
but I've not gotten that to work on Windows, unlike when I install on
Ubuntu machines.
Unfortunately, installation fails for at least these packages:

Do you want to install from sources the packages which need compilation?
y/n: n
Package which is only available in source form, and may need
compilation of C/C++/Fortran: ‘gpclib’
Do you want to attempt to install these from sources?
y/n: n
trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/crosstalk_1.0.0.zip'
Content type 'application/zip' length 598840 bytes (584 KB)
downloaded 584 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/deldir_0.1-12.zip'
Content type 'application/zip' length 173098 bytes (169 KB)
downloaded 169 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/distr_2.6.zip'
Content type 'application/zip' length 2226722 bytes (2.1 MB)
downloaded 2.1 MB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/distrEx_2.6.zip'
Content type 'application/zip' length 720392 bytes (703 KB)
downloaded 703 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/foreign_0.8-67.zip'
Content type 'application/zip' length 309745 bytes (302 KB)
downloaded 302 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/gam_1.14-3.zip'
Content type 'application/zip' length 319049 bytes (311 KB)
downloaded 311 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/lattice_0.20-34.zip'
Content type 'application/zip' length 731408 bytes (714 KB)
downloaded 714 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/MASS_7.3-45.zip'
Content type 'application/zip' length 1173817 bytes (1.1 MB)
downloaded 1.1 MB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/rpart_4.1-10.zip'
Content type 'application/zip' length 950721 bytes (928 KB)
downloaded 928 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/sem_3.1-8.zip'
Content type 'application/zip' length 1110127 bytes (1.1 MB)
downloaded 1.1 MB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/SparseM_1.76.zip'
Content type 'application/zip' length 952285 bytes (929 KB)
downloaded 929 KB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/survival_2.41-2.zip'
Content type 'application/zip' length 5426933 bytes (5.2 MB)
downloaded 5.2 MB

trying URL 'https://mirrors.nics.utk.edu/cran/bin/windows/contrib/3.4/VineCopula_2.1.1.zip'
Content type 'application/zip' length 1106702 bytes (1.1 MB)
downloaded 1.1 MB

package ‘crosstalk’ successfully unpacked and MD5 sums checked
package ‘deldir’ successfully unpacked and MD5 sums checked
package ‘distr’ successfully unpacked and MD5 sums checked
package ‘distrEx’ successfully unpacked and MD5 sums checked
package ‘foreign’ successfully unpacked and MD5 sums checked
package ‘gam’ successfully unpacked and MD5 sums checked
package ‘lattice’ successfully unpacked and MD5 sums checked
package ‘MASS’ successfully unpacked and MD5 sums checked
package ‘rpart’ successfully unpacked and MD5 sums checked
package ‘sem’ successfully unpacked and MD5 sums checked
package ‘SparseM’ successfully unpacked and MD5 sums checked
package ‘survival’ successfully unpacked and MD5 sums checked
package ‘VineCopula’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
C:\Users\Jan\AppData\Local\Temp\Rtmpyekpgu\downloaded_packages
installing the source packages ‘ctmm’, ‘plotly’, ‘scatterplot3d’,
‘startupmsg’, ‘SweaveListingUtils’
trying URL 'https://mirrors.nics.utk.edu/cran/src/contrib/ctmm_0.3.6.tar.gz'
Content type 'application/x-gzip' length 731682 bytes (714 KB)
downloaded 714 KB

trying URL 'https://mirrors.nics.utk.edu/cran/src/contrib/plotly_4.6.0.tar.gz'
Content type 'application/x-gzip' length 980458 bytes (957 KB)
downloaded 957 KB

trying URL 'https://mirrors.nics.utk.edu/cran/src/contri
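
A hedged sketch of two ways around the prompts shown above; neither is
verified against this particular failure, and both stick to documented
arguments of install.packages()/update.packages():

## stay with pre-built Windows binaries, avoiding the source prompt entirely
update.packages(ask = FALSE, type = "win.binary")

## or, when building from source anyway, skip the i386 sub-architecture
install.packages(c("ctmm", "plotly"), type = "source",
                 INSTALL_opts = "--no-multiarch")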

Re: [ESS] Curly brace indentation

2016-12-08 Thread Jan T Kim via ESS-help
Hi Martin and All,

thanks for your reply, please see my comments inline below:

On Thu, Dec 08, 2016 at 09:09:10AM +0100, Martin Maechler wrote:
> >>>>> Jan T Kim via ESS-help <ess-help@r-project.org>
> >>>>> on Tue, 6 Dec 2016 01:11:20 + writes:
> 
> > Hello All,
> > since some time, I get the following indentation behaviour: If I type
> 
> > f <- function(x)
> > {
> > return(x * x);
> > }
> > 
> > this gets indented as
> > 
> > f <- function(x)
> > {
> >   return(x * x);
> >   }
> 
> 
> > i.e. the closing curly brace is not vertically aligned with the opening 
> one.
> 
> What exactly is  "if I type" ?
> - in an emacs buffer for a foo.R file (i.e. a buffer in R-mode),
> right ?

yes -- I used a filename ending with ".R", the mode shows as
"(ESS[S] [none] ElDoc)".

With the "if I type" sample I basically mean to express that the
indentation after the opening curly brace is automatically generated,
rather than typed by me.

> Well, I don't see this
>   (and I would never use unnecessary  ' ; '  nor
>unnecessary return(.)  .. but that's not really relevant here)
> 
> Specifically, after typing [Enter] at the end of the line
>return(x * x);
> of course the cursor on the next line is below the first letter
> 'r' of 'return';

yes, that's what I get as well...

> but then if you type "}"  and [Enter] or [Tab]  then it alings
> correctly.
> 
> Don't you see that?

... but no, that "electric" alignment of the closing curly brace
is not what I get. The brace doesn't move to the left, it appears
below the "r" of "return" and stays there.

> If yes,  how could ESS behave any better?

The behaviour you describe is what I would like.

As some additional detail about the system, this is a newly
installed Ubuntu 16.04.1 LTS (Xenial Xerus), and the ESS package
is

ii  ess  16.10-1xenia  all  Emacs mode for statistical ...

Best regards, Jan


> > If I then go on and indent the buffer (C-x h C-M-\), the indentation is
> > updated to
> 
> 
> > f <- function(x)
> > {
> > return(x * x);
> > }
> 
> > so the opening and closing curly braces are now vertically aligned.
> 
> > This behaviour started several months ago. Reviewing the change logs,
> > I speculate that upgrading (via Ubuntu package manager) to a package
> > providing 15.09, where "the indentation logic has been refactored",
> > may be the cause of the change, but as I've done little R coding for
> > a while I can't really pinpoint this.
> 
> > I recently got a new computer at work and used that opportunity to
> > check that the behaviour occurs with a new account, i.e. without any
> > ~/.emacs file.
> 
> > After some code delving and hacking I've managed to adjust the electric
> > curly braces by adding this to my .emacs:
> 
> > (defun jtk-ess-electric-brace (arg)
> > "modified / extended ess-electric-brace"
> > (interactive "P")
> > (progn
> > ; (message "modified ess-electric-brace running")
> > (ess-electric-brace arg)
> > (ess-indent-command)
> > )
> > )
> 
> 
> > (defun jtk-ess-mode-hook ()
> > (progn
> > (local-set-key (kbd "{") 'jtk-ess-electric-brace)
> > (local-set-key (kbd "}") 'jtk-ess-electric-brace)
> > )
> > )
> 
> > So essentially I have the brace indented immediately after inserting
>     > it via the original ess-electric-brace command. However, this solution
> > is not 100% perfect as the indentation of closing braces occurs only
> > after some delay caused by briefly flashing the cursor at the 
> corresponding
> > opening brace.
> 
> > Quite possibly I'm using a clumsy approach to try to get indentation 
> during
> > typing consistent with that produced by indent-region, so suggestions 
> where
> > I may have messed up are welcome.
> 
> > Best regards & thanks in advance for any pointers, Jan
> > -- 
> > +- Jan T. Kim ---+
> > | email: jtt...@gmail.com|
> > | WWW:   http://www.jtkim.dreamhosters.com/  |
> > *-=<  hierarchical systems are for files, not for humans  >=-*
> 
> > __
> > ESS-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/ess-help

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help


[ESS] Curly brace indentation

2016-12-05 Thread Jan T Kim via ESS-help
Hello All,

For some time now, I have been getting the following indentation behaviour: if I type

f <- function(x)
{
return(x * x);
}

this gets indented as

f <- function(x)
{
    return(x * x);
    }

i.e. the closing curly brace is not vertically aligned with the opening one.

If I then go on and indent the buffer (C-x h C-M-\), the indentation is
updated to


f <- function(x)
{
    return(x * x);
}

so the opening and closing curly braces are now vertically aligned.

This behaviour started several months ago. Reviewing the change logs,
I speculate that upgrading (via Ubuntu package manager) to a package
providing 15.09, where "the indentation logic has been refactored",
may be the cause of the change, but as I've done little R coding for
a while I can't really pinpoint this.

I recently got a new computer at work and used that opportunity to
check that the behaviour occurs with a new account, i.e. without any
~/.emacs file.

After some code delving and hacking I've managed to adjust the electric
curly braces by adding this to my .emacs:

(defun jtk-ess-electric-brace (arg)
  "modified / extended ess-electric-brace"
  (interactive "P")
  (progn
; (message "modified ess-electric-brace running")
(ess-electric-brace arg)
(ess-indent-command)
  )
)


(defun jtk-ess-mode-hook ()
  (progn
(local-set-key (kbd "{") 'jtk-ess-electric-brace)
(local-set-key (kbd "}") 'jtk-ess-electric-brace)
  )
)

So essentially I have the brace indented immediately after inserting
it via the original ess-electric-brace command. However, this solution
is not 100% perfect as the indentation of closing braces occurs only
after some delay caused by briefly flashing the cursor at the corresponding
opening brace.

Quite possibly I'm using a clumsy approach to try to get indentation during
typing consistent with that produced by indent-region, so suggestions where
I may have messed up are welcome.

Best regards & thanks in advance for any pointers, Jan
-- 
 +- Jan T. Kim ---+
 | email: jtt...@gmail.com|
 | WWW:   http://www.jtkim.dreamhosters.com/  |
 *-=<  hierarchical systems are for files, not for humans  >=-*

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help


Re: [R] function which returns number of occurrences of a pattern in string

2016-10-24 Thread Jan Kacaba
Bob and Max, I thank you. It helped me much.
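
A small sketch of the stri_count suggestion, plus a base-R equivalent for
comparison:

library(stringi)
stri_count_fixed("banana", "an")      # 2 literal matches
stri_count_regex("banana", "a[nm]")   # 2 regex matches

## base R: gregexpr() returns -1 when there is no match, hence the guard
count_matches <- function(x, pattern) {
  m <- gregexpr(pattern, x)[[1]]
  if (m[1] == -1) 0L else length(m)
}
count_matches("banana", "an")         # 2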

2016-10-21 3:47 GMT+02:00 Bob Rudis <b...@rud.is>:

> `stringi::stri_count()`
>
> I know that the `stringr` pkg saves some typing (it wraps the
> `stringi` pkg), but you should really just use the `stringi` package.
> It has many more very useful functions with not too much more typing.
>
> On Thu, Oct 20, 2016 at 5:47 PM, Jan Kacaba <jan.kac...@gmail.com> wrote:
> > Hello dear R-help
> >
> > I tried to find function which returns number of occurrences of a pattern
> > in string. The closest match I've found is str_locate_all in stringr
> > package. I can use str_locate_all but write my function but I don't want
> > reinvent wheel.
> >
> > JK
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] function which returns number of occurrences of a pattern in string

2016-10-20 Thread Jan Kacaba
Hello dear R-help

I tried to find a function which returns the number of occurrences of a pattern
in a string. The closest match I've found is str_locate_all in the stringr
package. I could use str_locate_all and write my own function, but I don't want
to reinvent the wheel.

JK

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strange output of cat function used in recursive function

2016-10-01 Thread Jan Kacaba
2016-10-01 18:02 GMT+02:00 David Winsemius <dwinsem...@comcast.net>:
>
>> On Oct 1, 2016, at 8:44 AM, Jan Kacaba <jan.kac...@gmail.com> wrote:
>>
>> Hello Dear R-help
>>
>> I  tried to understand how recursive programming works in R. Bellow is
>> simple recursive function.
>>
>> binary1 <- function(n) {
>>  if(n > 1) {
>>binary(as.integer(n/2))
>>  }
>>  cat(n %% 2)
>> }
>
> Did you mean to type "binary1(as.integer(n)"?

Yes I meant that.

>> When I call binary1(10) I get 1010. I believe that cat function stores
>> value to a buffer appending values as recursion proceeds and at the
>> end it prints the buffer. Am I right?
>
> No. Read the ?cat help page. It returns NULL. The material you see at the 
> console is a side-effect.
>>
>> I tried to modify the function to get some understanding:
>>
>> binary2 <- function(n) {
>>  if(n > 1) {
>>binary2(as.integer(n/2))
>>  }
>>  cat(n %% 2, sep=",")
>> }
>>
>> With call binary2(10) I get also 1010. Why the output is not separated
>> by commas?
>
> I think because there is nothing to separate when it prints (since there was 
> no "buffer".

If I use the function:
binary3 <- function(n) {
  if (n > 1) {
    binary3(as.integer(n/2))
  }
  cat(n %% 2, ",")
}

and call binary3(10), the console output is separated. So there must be
some kind of buffer, and it also looks like there is some inconsistency
in how the cat function behaves. Probably there is another explanation.
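
A small illustration of what sep actually does: it only separates the several
values passed to a single cat() call, so a recursion of one-value calls has
nothing to separate; there is no buffer, just successive calls writing to the
same console line.

cat(1, 0, 1, sep = ",")  # 1,0,1  -- three values in one call, sep between them
cat(1, sep = ",")        # 1      -- a single value, nothing to separate
cat(1, ",")              # 1 ,    -- two arguments, joined by the default sep " "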

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] strange output of cat function used in recursive function

2016-10-01 Thread Jan Kacaba
Hello Dear R-help

I tried to understand how recursive programming works in R. Below is a
simple recursive function.

binary1 <- function(n) {
  if(n > 1) {
binary(as.integer(n/2))
  }
  cat(n %% 2)
}
When I call binary1(10) I get 1010. I believe that cat function stores
value to a buffer appending values as recursion proceeds and at the
end it prints the buffer. Am I right?

I tried to modify the function to get some understanding:

binary2 <- function(n) {
  if(n > 1) {
binary2(as.integer(n/2))
  }
  cat(n %% 2, sep=",")
}

With call binary2(10) I get also 1010. Why the output is not separated
by commas?

If I use cat(n %% 2, ",") on the last line of binary2, the output
is separated. Outside a recursive function, cat prints separated output
in both cases, e.g. cat(c(1:10), sep=",") and cat(c(1:10), ",").

Derek

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R Studio: Run script upon saving or exiting

2016-08-13 Thread Jan Kacaba
Dear R help,

I would like to run a script upon saving project files or exiting RStudio.
For example, I would like to back up the whole project in another directory.
The backup directory should be named such that an incremental version
number is added to the project name.

Is this somehow possible? Even better would be if one could also
quickly browse through the file versions.
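
A hedged sketch, not RStudio-specific: base R calls a function named .Last
(if one is defined) when the session ends, so something like this in the
project's .Rprofile could do the copying. The paths and the naming scheme
are hypothetical.

.Last <- function() {
  stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  backup_dir <- file.path("..", paste0(basename(getwd()), "_backup_", stamp))
  dir.create(backup_dir)
  file.copy(list.files(getwd(), full.names = TRUE), backup_dir, recursive = TRUE)
}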

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] print all variables inside function

2016-05-23 Thread Jan Kacaba
Hello dear R-help

I would like to use some short and simple names multiple times inside
one script without collisions. I need to wrap the variables inside
some object. I know I can use class function or environment. For
example as follows:

exmp1<-function(){


# knowns
pa=0.35
pb=0.35
pc=0.30
pad=0.015
pbd=0.010
pcd=0.020



# unknowns
pd=pa*pad+pb*pbd+pc*pcd
pdc=pc*pcd/pd
pda=pa*pad/pd
pba=pb*pbd/pd


y<-c(pad=pad,pbd=pbd,pcd=pcd,pd=pd,pdc=pdc,pda=pda,pba=pba) # this
line I would like to automate so I don't have to write it every time
return(y)
}
output<-exmp1()

Is it somehow possible to return the 'unknowns' and 'knowns' from the exmp1
function without having to explicitly write the 'y' line which puts
all the variables inside a list? For example with an imaginary function
'fprint' which takes exmp1 as the input: fprint(exmp1).
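
A minimal sketch of one way to avoid writing the 'y' line by hand:
as.list(environment()) (or mget(ls())) collects every local variable at the
end of the function.

exmp2 <- function(){
  # knowns
  pa=0.35; pb=0.35; pc=0.30
  pad=0.015; pbd=0.010; pcd=0.020
  # unknowns
  pd=pa*pad+pb*pbd+pc*pcd
  pdc=pc*pcd/pd
  pda=pa*pad/pd
  pba=pb*pbd/pd
  as.list(environment())   # every local variable, named, in one list
}
output<-exmp2()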

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] break string at specified possitions

2016-05-17 Thread Jan Kacaba
Excellent Hervé, thank you.

2016-05-13 11:48 GMT+02:00 Hervé Pagès <hpa...@fredhutch.org>:
> Hi,
>
> Here is the Biostrings solution in case you need to chop a long
> string into hundreds or thousands of fragments (a situation where
> base::substring() is very inefficient):
>
>   library(Biostrings)
>
>   ## Call as.character() on the result if you want it back as
>   ## a character vector.
>   fast_chop_string <- function(x, ends)
>   {
> if (!is(x, "XString"))
> x <- as(x, "XString")
> extractAt(x, at=PartitioningByEnd(ends))
>   }
>
> Will be much faster than substring (e.g. 100x or 1000x) when
> chopping a string like a Human chromosome into hundreds or
> thousands of fragments.
>
> Biostrings is a Bioconductor package:
>
>   https://bioconductor.org/packages/Biostrings
>
> Cheers,
> H.
>
>
>
> On 05/12/2016 01:18 AM, Jan Kacaba wrote:
>>
>> Nice solution Jim, thank you.
>>
>>
>>
>> 2016-05-12 2:45 GMT+02:00 Jim Lemon <drjimle...@gmail.com>:
>>>
>>> Hi again,
>>> Sorry, that should be:
>>>
>>> chop_string<-function(x,ends) {
>>>   starts<-c(1,ends[-length(ends)]+1)
>>>   return(substring(x,starts,ends))
>>> }
>>>
>>> Jim
>>>
>>> On Thu, May 12, 2016 at 10:05 AM, Jim Lemon <drjimle...@gmail.com> wrote:
>>>>
>>>> Hi Jan,
>>>> This might be helpful:
>>>>
>>>> chop_string<-function(x,ends) {
>>>>   starts<-c(1,ends[-length(ends)]-1)
>>>>   return(substring(x,starts,ends))
>>>> }
>>>>
>>>> Jim
>>>>
>>>>
>>>> On Thu, May 12, 2016 at 7:23 AM, Jan Kacaba <jan.kac...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Here is my attempt at function which computes margins from positions.
>>>>>
>>>>> require("stringr")
>>>>> require("dplyr")
>>>>>
>>>>> ends<-seq(10,100,8)  # end margins
>>>>> test_string<-"Lorem ipsum dolor sit amet, consectetuer adipiscing
>>>>> elit. Aliquam in lorem sit amet leo accumsan lacinia."
>>>>>
>>>>> sekoj=function(ends){
>>>>>l_ends<-length(ends)
>>>>>begs=vector(mode="integer",l_ends)
>>>>>begs[1]=1
>>>>>for (i in 2:(l_ends)){
>>>>>  begs[i]<-ends[i-1]+1
>>>>>}
>>>>>margs<-rbind(begs,ends)
>>>>>margs<-cbind(margs,c(ends[l_ends]+1,-1))
>>>>>#rownames(margs)<-c("beg","end")
>>>>>return(margs)
>>>>> }
>>>>> margins<-sekoj(ends)
>>>>> str_sub(test_string,margins[1,],margins[2,]) %>% print
>>>>>
>>>>> Code to run in browser:
>>>>> http://www.r-fiddle.org/#/fiddle?id=rVmNVxDV
>>>>>
>>>>> 2016-05-11 23:12 GMT+02:00 Bert Gunter <bgunter.4...@gmail.com>:
>>>>>>
>>>>>> Dunno -- but you might have a look at Hadley Wickham's 'stringr'
>>>>>> package:
>>>>>> https://cran.r-project.org/web/packages/stringr/stringr.pdf
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Bert
>>>>>>
>>>>>>
>>>>>> Bert Gunter
>>>>>>
>>>>>> "The trouble with having an open mind is that people keep coming along
>>>>>> and sticking things into it."
>>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>>>
>>>>>>
>>>>>> On Wed, May 11, 2016 at 1:12 PM, Jan Kacaba <jan.kac...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Dear R-help
>>>>>>>
>>>>>>> I would like to split long string at specified precomputed positions.
>>>>>>> 'substring' needs beginings and ends. Is there a native function
>>>>>>> which
>>>>>>> accepts positions so I don't have to count second argument?
>>>>>>>
>>>>>>> For example I have vector of possitions pos<-c(5,10,19). Substring
>>>>>>> needs input first=c(1,6,11) and last=c(5,10,19). There is no prob

Re: [R] break string at specified possitions

2016-05-12 Thread Jan Kacaba
Nice solution Jim, thank you.



2016-05-12 2:45 GMT+02:00 Jim Lemon <drjimle...@gmail.com>:
> Hi again,
> Sorry, that should be:
>
> chop_string<-function(x,ends) {
>  starts<-c(1,ends[-length(ends)]+1)
>  return(substring(x,starts,ends))
> }
>
> Jim
>
> On Thu, May 12, 2016 at 10:05 AM, Jim Lemon <drjimle...@gmail.com> wrote:
>> Hi Jan,
>> This might be helpful:
>>
>> chop_string<-function(x,ends) {
>>  starts<-c(1,ends[-length(ends)]-1)
>>  return(substring(x,starts,ends))
>> }
>>
>> Jim
>>
>>
>> On Thu, May 12, 2016 at 7:23 AM, Jan Kacaba <jan.kac...@gmail.com> wrote:
>>> Here is my attempt at function which computes margins from positions.
>>>
>>> require("stringr")
>>> require("dplyr")
>>>
>>> ends<-seq(10,100,8)  # end margins
>>> test_string<-"Lorem ipsum dolor sit amet, consectetuer adipiscing
>>> elit. Aliquam in lorem sit amet leo accumsan lacinia."
>>>
>>> sekoj=function(ends){
>>>   l_ends<-length(ends)
>>>   begs=vector(mode="integer",l_ends)
>>>   begs[1]=1
>>>   for (i in 2:(l_ends)){
>>> begs[i]<-ends[i-1]+1
>>>   }
>>>   margs<-rbind(begs,ends)
>>>   margs<-cbind(margs,c(ends[l_ends]+1,-1))
>>>   #rownames(margs)<-c("beg","end")
>>>   return(margs)
>>> }
>>> margins<-sekoj(ends)
>>> str_sub(test_string,margins[1,],margins[2,]) %>% print
>>>
>>> Code to run in browser:
>>> http://www.r-fiddle.org/#/fiddle?id=rVmNVxDV
>>>
>>> 2016-05-11 23:12 GMT+02:00 Bert Gunter <bgunter.4...@gmail.com>:
>>>> Dunno -- but you might have a look at Hadley Wickham's 'stringr' package:
>>>> https://cran.r-project.org/web/packages/stringr/stringr.pdf
>>>>
>>>> Cheers,
>>>>
>>>> Bert
>>>>
>>>>
>>>> Bert Gunter
>>>>
>>>> "The trouble with having an open mind is that people keep coming along
>>>> and sticking things into it."
>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>
>>>>
>>>> On Wed, May 11, 2016 at 1:12 PM, Jan Kacaba <jan.kac...@gmail.com> wrote:
>>>>> Dear R-help
>>>>>
>>>>> I would like to split long string at specified precomputed positions.
>>>>> 'substring' needs beginings and ends. Is there a native function which
>>>>> accepts positions so I don't have to count second argument?
>>>>>
>>>>> For example I have vector of possitions pos<-c(5,10,19). Substring
>>>>> needs input first=c(1,6,11) and last=c(5,10,19). There is no problem
>>>>> to write my own function. Just asking.
>>>>>
>>>>> Derek
>>>>>
>>>>> __
>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide 
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] break string at specified possitions

2016-05-11 Thread Jan Kacaba
Here is my attempt at a function which computes margins from positions.

require("stringr")
require("dplyr")

ends<-seq(10,100,8)  # end margins
test_string<-"Lorem ipsum dolor sit amet, consectetuer adipiscing
elit. Aliquam in lorem sit amet leo accumsan lacinia."

sekoj=function(ends){
  l_ends<-length(ends)
  begs=vector(mode="integer",l_ends)
  begs[1]=1
  for (i in 2:(l_ends)){
begs[i]<-ends[i-1]+1
  }
  margs<-rbind(begs,ends)
  margs<-cbind(margs,c(ends[l_ends]+1,-1))
  #rownames(margs)<-c("beg","end")
  return(margs)
}
margins<-sekoj(ends)
str_sub(test_string,margins[1,],margins[2,]) %>% print

Code to run in browser:
http://www.r-fiddle.org/#/fiddle?id=rVmNVxDV

2016-05-11 23:12 GMT+02:00 Bert Gunter <bgunter.4...@gmail.com>:
> Dunno -- but you might have a look at Hadley Wickham's 'stringr' package:
> https://cran.r-project.org/web/packages/stringr/stringr.pdf
>
> Cheers,
>
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, May 11, 2016 at 1:12 PM, Jan Kacaba <jan.kac...@gmail.com> wrote:
>> Dear R-help
>>
>> I would like to split long string at specified precomputed positions.
>> 'substring' needs beginings and ends. Is there a native function which
>> accepts positions so I don't have to count second argument?
>>
>> For example I have vector of possitions pos<-c(5,10,19). Substring
>> needs input first=c(1,6,11) and last=c(5,10,19). There is no problem
>> to write my own function. Just asking.
>>
>> Derek
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] break string at specified possitions

2016-05-11 Thread Jan Kacaba
Dear R-help

I would like to split a long string at specified precomputed positions.
'substring' needs beginnings and ends. Is there a native function which
accepts positions so I don't have to compute the second argument?

For example, I have a vector of positions pos<-c(5,10,19). Substring
needs the input first=c(1,6,11) and last=c(5,10,19). There is no problem
writing my own function. Just asking.

Derek
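
A minimal sketch along those lines, wrapping substring() so that only the end
positions need to be supplied:

chop_at <- function(x, ends) {
  starts <- c(1, head(ends, -1) + 1)
  substring(x, starts, ends)
}
chop_at("abcdefghijklmnopqrs", c(5, 10, 19))
# [1] "abcde"     "fghij"     "klmnopqrs"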

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] svyciprop object

2016-05-06 Thread kende jan via R-help
Hi, I'd like to access the different elements in a svyciprop object (the
confidence intervals in particular). But none of the functions I know works.
Thank you for your help!

> grr <- svyciprop(~temp==bzz, dclus1)
> grr
                              2.5%        97.5%
temp == bzz  0.040719697  0.027622756  0.05965
> attributes(grr)
$names
[1] "temp == bzz"

$var
                        as.numeric(temp == bzz)
as.numeric(temp == bzz)        6.42377038236e-05

$ci
           2.5%           97.5% 
0.0276227559667 0.0596454643748 

$class
[1] "svyciprop"

> grr$ci
Error in grr$ci : $ operator is invalid for atomic vectors
> grr["ci"]
  NA 
> ci(grr)
Error: could not find function "ci"
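
Judging from the attributes listing above, the interval is stored as the "ci"
attribute of the object, so the following should retrieve it (a hedged sketch;
the confint() method for svyciprop objects is provided by the survey package):

attr(grr, "ci")   # the 2.5% / 97.5% limits stored on the object
confint(grr)      # the survey package's confint method for svyciprop
as.numeric(grr)   # the point estimate itself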


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] row names, coulmn names

2016-05-01 Thread Jan Kacaba
Hello dear R helpers,

Is it possible to have more than one row of column names in a data.frame,
array or tbl_df? I would like to have column numbers in the first row, string
names in the second row and physical units in the third row.
How would I do it?

Derek
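
A minimal sketch, assuming the extra header rows are really metadata: a data
frame supports only one row of column names, so units and column numbers are
usually carried as attributes (or in a separate lookup table) rather than as
extra header rows. The example data are made up.

d <- data.frame(flow = c(1.2, 3.4), temp = c(281, 285))
attr(d, "units")  <- c(flow = "m3/s", temp = "K")
attr(d, "col_no") <- seq_along(d)
attributes(d)[c("units", "col_no")]   # retrieve the extra 'header rows' on demand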

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] inserting row(column) in array or dataframe at specified row(column)

2016-05-01 Thread Jan Kacaba
Hello dear R users,

Is there a function or package which can insert a row, column or array into
another array at a specified place (row or column)?

I have made several attempts at this function, optimizing speed, code
readability and ease of use. The functions are of the following format:

appcol=function(original_array, inserted_object, column_number,
overwrite=FALSE)

# If overwrite=TRUE the columns after column_number are overwritten by
inserted_object, else the columns after column_number are shifted.

Now I have started using the package dplyr and it seems that there is no
inserting function there either. One can only append at the end or at the
beginning of a tbl_df. Is that true?
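
A minimal sketch of the column case for a plain matrix or data frame, ignoring
the overwrite option; it just re-binds the pieces around the insertion point:

appcol <- function(x, obj, column_number) {
  left  <- x[, seq_len(column_number), drop = FALSE]
  right <- x[, setdiff(seq_len(ncol(x)), seq_len(column_number)), drop = FALSE]
  cbind(left, obj, right)   # obj ends up right after column 'column_number'
}
m <- matrix(1:12, nrow = 3)
appcol(m, c(100, 200, 300), 2)   # insert a new column after column 2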

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (windows) opening document with particular exe file

2016-04-26 Thread Jan Kacaba
Hello dear R, I don't have a specific task in mind, I'm just learning R.

1) Is it possible to open a document, for example path1\myfile.pdf, with
the program path2\pdfviewer.exe?
How would I do it in Windows? Does it differ in Linux?

2) Is it possible to run a program and supply some streams to it? The
streams are, for example, a txt file or a web address.

One specific task which comes to mind: I would like to draw in Inkscape
programmatically with a script. Is it somehow possible?

Thank you very much for any help in advance.
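
A hedged sketch for question 1 on Windows; the paths are placeholders:

## open a document with a specific program
system2("C:/path2/pdfviewer.exe", args = shQuote("C:/path1/myfile.pdf"), wait = FALSE)

## or open it with whatever program Windows associates with the file type
shell.exec("C:/path1/myfile.pdf")

## on Linux the rough equivalent would be, e.g., system2("xdg-open", "myfile.pdf")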

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Accented characters, windows

2016-03-30 Thread Jan Kacaba
Duncun, thank you for your reply. My encoding is:

> Sys.getlocale('LC_CTYPE')
[1] "Czech_Czech Republic.1250"

In RStudio I use UTF-8. I tried also other recommended encodings but some
characters are still misrepresented.

I've found a solution to this. To correctly display strings in RStudio I have
to convert them:
iconv(x, "CP1250", "UTF-8")

If I want to write the string into a file:
zz <- file("myfile.txt", "w", encoding = "UTF-8")
cat(x, file = zz, sep = "\n")
close(zz)

It seems there is no need to use iconv() if I just need to write the string to
a file.

I hope there is no problem processing strings with other functions like
paste, strsplit and grep, though.

Derek

2016-03-30 0:56 GMT+02:00 Duncan Murdoch <murdoch.dun...@gmail.com>:

> On 29/03/2016 5:39 PM, Jan Kacaba wrote:
>
>> I have problem with accented characters. My OS is Win 8.1 and I'm using
>> RStudio.
>>
>> I make string :
>> av="ěščřž"
>>
>> When I call "av" I get result bellow.
>>
>>> av
>>>
>> [1] "ìšèøž"
>>
>> The resulting characters are different. I have similar problem when I
>> write
>> string to a file. In RGUI if I call "av" it prints characters correctly,
>> but using "write" function to print string in a file results in the same
>> problem.
>>
>> Can you please help me how to deal with it?
>>
>
> You don't say what code page you're using.
>
> R in Windows has a long standing problem that it works mainly in the local
> code page, rather than working in UTF-8 as most other systems do.  (This is
> due to the fact that when the internationalization was put in, UTF-8 was
> exotic, rather than ubiquitous as it is now.)  So R can store UTF-8 strings
> on any system, but for display it converts them to the local code page, and
> that conversion can lose information if the characters aren't supported
> locally.
>
> With your string, I don't see the same thing as you, I see
>
> "ešcrž"
>
> which is also incorrect, but looks a little closer, because it does a
> better approximation in my code page.
>
> So if you think my result is better than yours, you could change your
> system to code page 437 as I'm using, but that will probably cause you
> worse problems.
>
> Probably the only short term solution that would be satisfactory is to
> stop using Windows.  At some point in the future the internal character
> handling in R needs an overhaul, but that's a really big, really thankless
> job.  Perhaps Microsoft/Revolution will donate some programmer time to do
> it, but more likely, it will wait for volunteers in R Core to do it.  I
> don't think it will happen in 2016.
>
> Duncan Murdoch
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Accented characters, windows

2016-03-29 Thread Jan Kacaba
I have a problem with accented characters. My OS is Win 8.1 and I'm using
RStudio.

I make string :
av="ěščřž"

When I call "av" I get the result below.
> av
[1] "ìšèøž"

The resulting characters are different. I have a similar problem when I write
the string to a file. In RGui, if I call "av" it prints the characters correctly,
but using the "write" function to print the string to a file results in the same
problem.

Can you please help me how to deal with it?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R studio kniter

2016-03-22 Thread Jan Kacaba
Hello, is it possible to run knitr from a script instead of by clicking on
the Compile PDF button?

Say I have "texfile.rnw" and "myscript.R". I would like to knit texfile.rnw
by running the script "myscript.R".
In "myscript.R" I would write something like this:
knit("texfile.rnw")
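
A minimal sketch, assuming knitr and a LaTeX installation are available:
knit() produces the .tex file, and knit2pdf() knits and then runs LaTeX in
one call.

library(knitr)
knit("texfile.rnw")       # produces texfile.tex
knit2pdf("texfile.rnw")   # knits and compiles, producing texfile.pdf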

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to reach the column names in a huge .RData file without loading it

2016-03-19 Thread Jan T Kim
On Wed, Mar 16, 2016 at 03:18:27PM -0400, Duncan Murdoch wrote:
> On 16/03/2016 1:40 PM, Jan Kim wrote:
> >Barry: that's an interesting hack.
> >
> >I do feel compelled to make two comments, though, regarding the
> >general issue rather than the scraping idea:
> >
> >(1) If your situation is that that image (.RData file) is the only
> >copy of the data, you'll need to rescue the data from that as soon as
> >possible anyway. Something like
> >
> > load(".RData");
> > write.csv(mydataframe, file = "mydata.csv");
> >
> >should do this trick. It will be slow, but you'll need to do it just
> >once, so you might as well enjoy your coffee while you wait. From that
> >point on, work with the mydata.csv file for getting at the colnames
> >(and anything else as well).
> >
> >(2) If there's any chance / risk that scraping data off images is not
> >a one-off, the time to prevent that from catching on is now. If data is
> >of any value at all, it should be handled in a sane, portable, textual
> >format. For tabular data, csv is normally adequate or at least good
> >enough, but .RData images are never a good idea.
> 
> I agree with the sentiment, but not with the choice of .csv as a
> "sane, portable, textual format".  CSV has no type information
> included, so strings that contain only digits can turn into numbers
> (and get rounded in the process), things that look like
> dates can get converted to different formats, etc.

I entirely agree. In hindsight, I should have stated that the .RData files,
as well as the R code to load and extract stuff from them, should be stored
permanently and documented.

> The .RData format has the disadvantages of being hard to use outside
> R, but at least it is usable in R.

yes -- that's why I thought it's a good idea to use R to pluck out the
valuable data, so (1) they can still be accessed even if the .RData
format changes and (2) they're in their own file, separated from the
(potentially humongous, see my P.S.) amount of other stuff caught up
in the image.

But to reiterate, the .RData file should be secured as well if that's
the only remaining primary / original source of the data.

> I don't know what I'd recommend if I wanted a portable textual
> format.  JSON is close, but it can't handle the full
> range of data that R can handle (e.g. no Inf).  dput() on a
> dataframe is text, but nothing but R can read it.

yes, that's the problem with "JSON": it's JavaScript, but not really
an object notation, as it doesn't store class-structure metadata.

So again, the best bet is to secure multiple levels: the .RData
image to preserve the R types, the R script to be able to identify
the relevant variable(s), and the text version to avoid depending on
the availability of R / of an R version still able to read the image format.

Best regards, Jan


> Duncan Murdoch
> 
> 
> >
> >Best regards, Jan
> >
> >P.S.: I've seen .RData images containing many months worth of interactive
> >work, and multiple variants of data frames in variables with more or less
> >similar names, so the set of strings scraped off these will be rather more
> >bewildering than in Barry's clean example.
> >
> >
> >On Wed, Mar 16, 2016 at 05:17:25PM +, Barry Rowlingson wrote:
> >> You *might* be able to get them from the raw file...
> >>
> >> First, I don't quite know what "colnames" of an .RData file means.
> >> "colnames" are the column names of a matrix (or data frame), so I'll
> >> assume your .RData file contains exactly one data frame and you want
> >> to column names of it.
> >>
> >> So let's create one of those:
> >>
> >>
> >> mydataframe = data.frame(mylongnamehere=runif(3),
> >> anotherlongname=runif(3), z=runif(3), y=runif(3),
> >> aasdkjhasdkjhaskdj=runif(3))
> >> save(mydataframe, file="./test.RData")
> >>
> >> Now I'm going to use some Unix utilities to see if there's any
> >> identifiable strings in the file. .RData files are by default
> >> compressed using `gzip`, so I'll `gunzip` them and pipe it into
> >> `strings`:
> >>
> >> $ gunzip -c test.RData | strings -t d
> >>   0 RDX2
> >>  35 mydataframe
> >> 230 names
> >> 251 mylongnamehere
> >> 273 anotherlongname
> >> 314 aasdkjhasdkjhaskdj
> >> 347 row.names
> >> 389 class
> >> 410 data.frame
> >>
> >>
> >>   - thats found the object name (mydataframe) and most of the column
> >> names except the short

Re: [R] How to reach the column names in a huge .RData file without loading it

2016-03-19 Thread Jan Kim
Barry: that's an interesting hack.

I do feel compelled to make two comments, though, regarding the
general issue rather than the scraping idea:

(1) If your situation is that that image (.RData file) is the only
copy of the data, you'll need to rescue the data from that as soon as
possible anyway. Something like

load(".RData");
write.csv(mydataframe, file = "mydata.csv");

should do this trick. It will be slow, but you'll need to do it just
once, so you might as well enjoy your coffee while you wait. From that
point on, work with the mydata.csv file for getting at the colnames
(and anything else as well).
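
A small extension of that idea for the column-name case, so that later runs
never have to load the big image again (file names are hypothetical):

load(".RData");
saveRDS(colnames(mydataframe), file = "mydataframe_colnames.rds");
## in later sessions:
cn <- readRDS("mydataframe_colnames.rds");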

(2) If there's any chance / risk that scraping data off images is not
a one-off, the time to prevent that from catching on is now. If data is
of any value at all, it should be handled in a sane, portable, textual
format. For tabular data, csv is normally adequate or at least good
enough, but .RData images are never a good idea.

Best regards, Jan

P.S.: I've seen .RData images containing many months worth of interactive
work, and multiple variants of data frames in variables with more or less
similar names, so the set of strings scraped off these will be rather more
bewildering than in Barry's clean example.


On Wed, Mar 16, 2016 at 05:17:25PM +, Barry Rowlingson wrote:
> You *might* be able to get them from the raw file...
> 
> First, I don't quite know what "colnames" of an .RData file means.
> "colnames" are the column names of a matrix (or data frame), so I'll
> assume your .RData file contains exactly one data frame and you want
> to column names of it.
> 
> So let's create one of those:
> 
> 
> mydataframe = data.frame(mylongnamehere=runif(3),
> anotherlongname=runif(3), z=runif(3), y=runif(3),
> aasdkjhasdkjhaskdj=runif(3))
> save(mydataframe, file="./test.RData")
> 
> Now I'm going to use some Unix utilities to see if there's any
> identifiable strings in the file. .RData files are by default
> compressed using `gzip`, so I'll `gunzip` them and pipe it into
> `strings`:
> 
> $ gunzip -c test.RData | strings -t d
>   0 RDX2
>  35 mydataframe
> 230 names
> 251 mylongnamehere
> 273 anotherlongname
> 314 aasdkjhasdkjhaskdj
> 347 row.names
> 389 class
> 410 data.frame
> 
> 
>   - thats found the object name (mydataframe) and most of the column
> names except the short ones, which are too short for `strings` to
> recognise. But if your names are long enough (4 or more chars, I
> think) they'll show up.
> 
>  Of course you'll have to filter them out from all the other string
> output, but they should all appear shortly after the word "names",
> since the colnames of a data frame are the "names" attribute of the
> data.
> 
>  If you don't have a Unix or Mac machine handy you can get these
> utilities on Windows via Cygwin but that's another story...
> 
>  Barry
> 
> 
> 
> 
> 
> 
> 
> 
> On Wed, Mar 16, 2016 at 3:59 PM, Lida Zeighami <lid.z...@gmail.com> wrote:
> > Hi,
> > I have a huge .RData file and I need just to get the colnames of it. so is
> > there any way to reach the column names without loading or reading the
> > whole file?
> > Since the file is so big and I need to repeat this process several times,
> > so it takes so long to load the file first and then take the colnames!
> >
> > Thanks
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
 +- Jan T. Kim ---+
 | email: jtt...@gmail.com|
 | WWW:   http://www.jtkim.dreamhosters.com/  |
 *-=<  hierarchical systems are for files, not for humans  >=-*

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] treating integer(0) and NULL in conditions and loops

2016-03-11 Thread Jan Kacaba
Hello, I have the following problem in loops. It has occurred to me multiple times;
below is an example. Inside if() I sometimes have a function f(x) which may
return integer(0). If I test f(x)>1 and f(x) is integer(0), I get an error. Maybe
it can be solved more elegantly without loops or switches. I don't know.

Example:

a=c("ab","abc","abcd","abcde","abcdefghjk") # vector from which new strings will be constructed
svec=NULL # vector of strings
rz=NULL # string

for (i in 1:10){
  if (nchar(rz)>6){
    svec[i]=rz
    rz=NULL
  }

  if (nchar(a[i])+nchar(rz)<6){
    rz=paste(rz,a[i])
  }

  if (nchar(rz)+nchar(a[i+1])>6){
    svec[i]=rz
    rz=NULL
  }
}

I'm not interested in how to treat the nchar() function in particular, but
functions in general. One solution which comes to mind is to redefine the
function, for example nchar(), like this:

new.nchar=function(x){
  if (length(nchar(x))==0){z=0}
  if (length(nchar(x))>0){z=nchar(x)}
  return(z)
}

Is this a correct way of doing it, or is there a better way without the need
to define a new function?
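
A minimal sketch of an alternative guard that avoids redefining nchar(): wrap
the comparison itself, so that a zero-length result simply counts as FALSE.

safe_gt <- function(x, limit) isTRUE(x > limit)   # FALSE for NULL/integer(0)
safe_gt(nchar(NULL), 6)         # FALSE (nchar(NULL) is integer(0))
safe_gt(nchar("abcdefgh"), 6)   # TRUE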


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] assign a vector to list sequence

2016-03-09 Thread Jan Kacaba
Hello, I would like to assign a vector to a sequence of list elements. I'm
trying the code below, but the output is not what I intended.

# my code
mls=vector(mode="list") # my list
cseq=c(1:3) # my vector
mls[cseq]=cseq

I get the following:
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

What I need is this:
[[1]]
[1] 1 2 3

[[2]]
[1] 1 2 3

[[3]]
[1] 1 2 3
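
A small sketch of an assignment that produces that result: wrapping the vector
in list() recycles the whole vector into each element.

mls <- vector(mode = "list", length = 3)
cseq <- c(1:3)
mls[1:3] <- list(cseq)       # every element now holds the full vector 1 2 3
mls2 <- rep(list(cseq), 3)   # equivalent construction in one step
str(mls)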

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

