Re: [R] Mailinglist

2019-01-06 Thread K. Elo
Hi!

Not having a data chunk prevents me from testing a bit, but maybe you
should take a look at:

?table
?xtabs

to start with.
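
As a rough sketch of what these two functions can do here, assuming a data
frame with one row per logged event (the column names participant and probe
and all values below are made up for illustration, they are not the real
data):

```r
# Hypothetical data: one row per logged event
probes <- data.frame(
  participant = c("P01", "P01", "P01", "P02", "P02", "P03"),
  probe       = c("CallLogProbe", "SMSProbe", "CallLogProbe",
                  "CallLogProbe", "SMSProbe", "SMSProbe"),
  stringsAsFactors = FALSE
)

# One-way table: number of events per probe type
probe_counts <- table(probes$probe)

# Two-way table: number of events per participant and probe type
by_participant <- xtabs(~ participant + probe, data = probes)
print(by_participant)
```

Each cell of by_participant is the number of events of that probe type for
that participant; combinations that never occur show up as 0.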

But as already suggested by other users, a small data set would be of
great help :)

HTH,
Kimmo

On Sun, 2019-01-06 at 13:49 -0500, Rachel Thompson wrote:
> Hi Rich,
> 
> I really feel lost at this point.
> I need a code that helps me count the phone activity
> level(high/low/none),
> the screen activity (on/off) and the amount calls and SMS of each
> subject.
> 
> 1. I want to have a summary of how many times a specific subject got
> called
> (CallLogProbe)
> 2. I want to have a summary of how many times a specific subject got
> a text
> message (SMS probe)
> 3. I want to have a summary of how many times a specific subject
> - Turned their screen on - True  (ScreenProbe)
> - Or did not turn their screen on - False (ScreenProbe)
> 4.  I want to have a summary of the activity level of a specific
> subject
> - Activity level - none (ActivityProbe)
> - Activity level- low (ActivityProbe)
> - Activity level - High  (ActivityProbe)
> 
> I want to do this for all the 36 subjects(Participants).
> In the end, I have to define the percentages and cutoff points of
> what is
> considered low-medium-high, based on what the results of all the
> subjects
> are. So I am able to see if a specific subject has low social
> interaction
> etc.
> 
> I have tried a lot, with the help of youtube etc. But I feel as if I
> am
> trying a lot of things but without clearly knowing if it is the right
> step.
> I have a csv file, but I need to look into what Jeff said about the
> guides.
> So I am able to share it.
> 
> Best.
> 
> 
> On Sun, Jan 6, 2019 at 11:51 AM Rich Shepard <
> rshep...@appl-ecosys.com>
> wrote:
> 
> > On Sun, 6 Jan 2019, Rachel Thompson wrote:
> > 
> > > I am an intern from Amsterdam and I have to do an analysis in R.
> > > I spoke
> > > to my professor in Amsterdam and my supervisor's here in Boston.
> > > But they
> > > are to busy to help. I informed them from the start that I am not
> > 
> > familiar
> > > with R(Rstudio) and they told me that I would receive guidance.
> > > So since
> > > they can not help me, I decided to share my problem online. (It
> > > is a CVS
> > > file imported into R)
> > 
> > Rachel,
> > 
> >I find it interesting that you're put in such a difficult
> > position. I've
> > not followed this thread from the start so my comments might be
> > redundant
> > or
> > inappropriate.
> > 
> >If you can, describe the problem. That is, what are you being
> > asked to
> > find and what are the available data? This information helps us to
> > guide
> > you
> > to learning the mechanics for accomplishing your task with R.
> > 
> > Regards,
> > 
> > Rich
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 
> 
>   [[alternative HTML version deleted]]
> 



[ESS] ess-noweb-font-lock-mode: emacs hangs using uncomment-region with math environments

2019-01-06 Thread Braun, Michael via ESS-help
In a .Rnw file, when calling uncomment-region on a region that contains a LaTeX 
math environment (such as align), Emacs will hang, requiring a Force Quit.  I 
am using Aquamacs Emacs 3.5, ESS 18.10.2, and MacOS 10.14.2, but this problem 
has persisted since at least Aquamacs 3.2 and ESS 16.10.   I’ve been wrestling 
with this issue since at least 2016, but now it’s time to get some help.

To replicate, save the following content in a file with a .Rnw extension. Then 
select a region that includes the equation, and run comment-region, and then 
uncomment-region.  I have these functions bound to M-; or C-c ; .

--
\documentclass{article}
\usepackage{amsmath}

\begin{document}

On the following equation, try comment-region, and then uncomment-region.
The uncomment-region call is what hangs the process.

\begin{align}
  1+1=2
\end{align}

\end{document}
---

My longstanding workaround is to either toggle ess-noweb-font-lock-mode off, or 
toggle font-lock-mode on, right after opening the file.  This makes the problem 
with uncomment-region disappear. But then I lose the ESS font-lock features.

Interestingly, this does not crash Emacs  if the file is saved with a .tex 
extension.  Also, in a .tex file, comment-region prefixes lines with %, but in 
a .Rnw file, the comment prefix is %%. I’m not sure if that’s relevant, but it 
might be, and I’d like to find a way to change that behavior as well.

Any thoughts? 

Thanks,

Michael Braun
braunm _at_ smu.edu







__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help


Re: [R] Need help for logical expression

2019-01-06 Thread Jeff Newmiller

See below

On Mon, 7 Jan 2019, roslinazairimah zakaria wrote:


> Dear all,
>
> I just have a simple problem here.
>
> I generate a sample data as follows:
> set.seed(123456)
> r1 <- sample(1:100,100 ,replace=T)
> r2 <- sample(1:100,100 ,replace=T)
> r3 <- sample(1:100,100 ,replace=T)
>
> R <- cbind(r1,r2,r3); head(R)
>      r1 r2 r3
> [1,] 80  4 20
> [2,] 76 66 14
> [3,] 40 32 87
> [4,] 35 19 24
> [5,] 37 63 12
> [6,] 20 52 42
> sum_r <- rowSums(R)
> all_pct <- round(R/sum_r*100,0); head(all_pct)
>      r1 r2 r3
> [1,] 77  4 19
> [2,] 49 42  9
> [3,] 25 20 55
> [4,] 45 24 31
> [5,] 33 56 11
> [6,] 18 46 37
>
> I would like to count how many of all_pct data satisfy this condition as
> follows:
>
> dt_all <- ifelse(all_pct[,1] >= 45 & all_pct[,1] > all_pct[,2] &
> all_pct[,1] > all_pct[,3], 1, 0)
>
> Note that data of all_pct[,1] >= 45  and at the same time greater
> than all_pct[,2] and all_pct[,3].
>
> How do I count how many satisfy the conditions?

sum( dt_all )

Did I misunderstand?
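
A small sketch of the same idea without the ifelse() step, using only the six
rows printed by head(all_pct) in the question (so the count below is for those
six rows, not for the full 100-row matrix). A logical vector can be summed
directly, because TRUE counts as 1 and FALSE as 0:

```r
# The six rows shown by head(all_pct) in the question
all_pct <- rbind(c(77,  4, 19),
                 c(49, 42,  9),
                 c(25, 20, 55),
                 c(45, 24, 31),
                 c(33, 56, 11),
                 c(18, 46, 37))
colnames(all_pct) <- c("r1", "r2", "r3")

# TRUE where column 1 is at least 45 and larger than both other columns
hit <- all_pct[, 1] >= 45 &
       all_pct[, 1] > all_pct[, 2] &
       all_pct[, 1] > all_pct[, 3]

n_hit    <- sum(hit)    # number of rows satisfying the condition
hit_rows <- which(hit)  # which rows they are
print(n_hit)
print(hit_rows)
```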



> --
> *Roslinazairimah Zakaria*
> *Tel: +609-5492370; Fax. No.+609-5492766*
>
> *Email: roslinazairi...@ump.edu.my ;
> roslina...@gmail.com *
> Faculty of Industrial Sciences & Technology
> University Malaysia Pahang
> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
>
> [[alternative HTML version deleted]]


HTML email will eventually impede others trying to understand your code... 
the sooner you stop posting HTML the better.







---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)



[R] Need help for logical expression

2019-01-06 Thread roslinazairimah zakaria
Dear all,

I just have a simple problem here.

I generate sample data as follows:
set.seed(123456)
r1 <- sample(1:100,100 ,replace=T)
r2 <- sample(1:100,100 ,replace=T)
r3 <- sample(1:100,100 ,replace=T)

R <- cbind(r1,r2,r3); head(R)
     r1 r2 r3
[1,] 80  4 20
[2,] 76 66 14
[3,] 40 32 87
[4,] 35 19 24
[5,] 37 63 12
[6,] 20 52 42
sum_r <- rowSums(R)
all_pct <- round(R/sum_r*100,0); head(all_pct)
     r1 r2 r3
[1,] 77  4 19
[2,] 49 42  9
[3,] 25 20 55
[4,] 45 24 31
[5,] 33 56 11
[6,] 18 46 37

I would like to count how many rows of all_pct satisfy the following
condition:

dt_all <- ifelse(all_pct[,1] >= 45 & all_pct[,1] > all_pct[,2] &
all_pct[,1] > all_pct[,3], 1, 0)

Note that all_pct[,1] must be >= 45 and at the same time greater than
both all_pct[,2] and all_pct[,3].

How do I count how many rows satisfy the condition?

-- 
*Roslinazairimah Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*

*Email: roslinazairi...@ump.edu.my ;
roslina...@gmail.com *
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia

[[alternative HTML version deleted]]



Re: [R] Mailinglist

2019-01-06 Thread Richard M. Heiberger
Questions like this
1. I want to have a summary of how many times a specific subject got called
(CallLogProbe)

suggest that you should look at the table function.  See
?table
and run the examples.
They show how to get one-way frequency tables and two-way contingency
tables.

If you have followup questions for the list, you can use the examples in
?table as your starting point.
That way you don't need to worry about sharing your own data.
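
For instance, a minimal sketch with the built-in warpbreaks data set (one of
the data sets used in the ?table examples); wool and tension merely stand in
for whatever grouping columns your own data has:

```r
# One-way frequency table: observations per tension level
one_way <- table(warpbreaks$tension)

# Two-way contingency table: observations per wool type and tension level
two_way <- table(wool = warpbreaks$wool, tension = warpbreaks$tension)

# Row-wise proportions, which can be multiplied by 100 for percentages
row_props <- prop.table(two_way, margin = 1)

print(one_way)
print(two_way)
print(round(100 * row_props, 1))
```

A follow-up question phrased in terms of warpbreaks can be run by anyone on
the list as is.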


On Sun, Jan 6, 2019 at 1:59 PM Rachel Thompson <
rachel.thomp...@student.uva.nl> wrote:

> Hi Rich,
>
> I really feel lost at this point.
> I need a code that helps me count the phone activity level(high/low/none),
> the screen activity (on/off) and the amount calls and SMS of each subject.
>
> 1. I want to have a summary of how many times a specific subject got called
> (CallLogProbe)
> 2. I want to have a summary of how many times a specific subject got a text
> message (SMS probe)
> 3. I want to have a summary of how many times a specific subject
> - Turned their screen on - True  (ScreenProbe)
> - Or did not turn their screen on - False (ScreenProbe)
> 4.  I want to have a summary of the activity level of a specific subject
> - Activity level - none (ActivityProbe)
> - Activity level- low (ActivityProbe)
> - Activity level - High  (ActivityProbe)
>
> I want to do this for all the 36 subjects(Participants).
> In the end, I have to define the percentages and cutoff points of what is
> considered low-medium-high, based on what the results of all the subjects
> are. So I am able to see if a specific subject has low social interaction
> etc.
>
> I have tried a lot, with the help of youtube etc. But I feel as if I am
> trying a lot of things but without clearly knowing if it is the right step.
> I have a csv file, but I need to look into what Jeff said about the guides.
> So I am able to share it.
>
> Best.
>
>
> On Sun, Jan 6, 2019 at 11:51 AM Rich Shepard 
> wrote:
>
> > On Sun, 6 Jan 2019, Rachel Thompson wrote:
> >
> > > I am an intern from Amsterdam and I have to do an analysis in R. I
> spoke
> > > to my professor in Amsterdam and my supervisor's here in Boston. But
> they
> > > are to busy to help. I informed them from the start that I am not
> > familiar
> > > with R(Rstudio) and they told me that I would receive guidance. So
> since
> > > they can not help me, I decided to share my problem online. (It is a
> CVS
> > > file imported into R)
> >
> > Rachel,
> >
> >I find it interesting that you're put in such a difficult position.
> I've
> > not followed this thread from the start so my comments might be redundant
> > or
> > inappropriate.
> >
> >If you can, describe the problem. That is, what are you being asked to
> > find and what are the available data? This information helps us to guide
> > you
> > to learning the mechanics for accomplishing your task with R.
> >
> > Regards,
> >
> > Rich
> >
> >
>
> [[alternative HTML version deleted]]
>
>

[[alternative HTML version deleted]]



Re: [R] Mailinglist

2019-01-06 Thread Rich Shepard

On Sun, 6 Jan 2019, Rachel Thompson wrote:


I need a code that helps me count the phone activity level(high/low/none),
the screen activity (on/off) and the amount calls and SMS of each subject.

1. I want to have a summary of how many times a specific subject got called
(CallLogProbe)
2. I want to have a summary of how many times a specific subject got a text
message (SMS probe)
3. I want to have a summary of how many times a specific subject
- Turned their screen on - True  (ScreenProbe)
- Or did not turn their screen on - False (ScreenProbe)
4.  I want to have a summary of the activity level of a specific subject
- Activity level - none (ActivityProbe)
- Activity level- low (ActivityProbe)
- Activity level - High  (ActivityProbe)

I want to do this for all the 36 subjects(Participants).
In the end, I have to define the percentages and cutoff points of what is
considered low-medium-high, based on what the results of all the subjects
are. So I am able to see if a specific subject has low social interaction
etc.


Rachel,

  Those more experienced with R than I am will probably offer their
thoughts, too.

  It looks to me as though you want counts of various parameters for each
participant, and perhaps for groups of participants. When I read this I see SQL
SELECT count() statements on database tables, not descriptive or explanatory
statistical results.

  Not knowing your computer's OS or whether there is a relational database
management system installed, I can't offer specific suggestions. A database
would provide counts of each *Probe as well as combinations. If you want to
go that route, contact me off the mail list and I'll suggest mail lists for
help if you use Microsoft products, as I know only PostgreSQL and SQLite.

Regards,

Rich



Re: [R] Mailinglist

2019-01-06 Thread Hasan Diwan
Maybe you could put the CSV in a gist or something? -- H

On Sun, 6 Jan 2019 at 10:58, Rachel Thompson 
wrote:

> Hi Rich,
>
> I really feel lost at this point.
> I need a code that helps me count the phone activity level(high/low/none),
> the screen activity (on/off) and the amount calls and SMS of each subject.
>
> 1. I want to have a summary of how many times a specific subject got called
> (CallLogProbe)
> 2. I want to have a summary of how many times a specific subject got a text
> message (SMS probe)
> 3. I want to have a summary of how many times a specific subject
> - Turned their screen on - True  (ScreenProbe)
> - Or did not turn their screen on - False (ScreenProbe)
> 4.  I want to have a summary of the activity level of a specific subject
> - Activity level - none (ActivityProbe)
> - Activity level- low (ActivityProbe)
> - Activity level - High  (ActivityProbe)
>
> I want to do this for all the 36 subjects(Participants).
> In the end, I have to define the percentages and cutoff points of what is
> considered low-medium-high, based on what the results of all the subjects
> are. So I am able to see if a specific subject has low social interaction
> etc.
>
> I have tried a lot, with the help of youtube etc. But I feel as if I am
> trying a lot of things but without clearly knowing if it is the right step.
> I have a csv file, but I need to look into what Jeff said about the guides.
> So I am able to share it.
>
> Best.
>
>
> On Sun, Jan 6, 2019 at 11:51 AM Rich Shepard 
> wrote:
>
> > On Sun, 6 Jan 2019, Rachel Thompson wrote:
> >
> > > I am an intern from Amsterdam and I have to do an analysis in R. I
> spoke
> > > to my professor in Amsterdam and my supervisor's here in Boston. But
> they
> > > are to busy to help. I informed them from the start that I am not
> > familiar
> > > with R(Rstudio) and they told me that I would receive guidance. So
> since
> > > they can not help me, I decided to share my problem online. (It is a
> CVS
> > > file imported into R)
> >
> > Rachel,
> >
> >I find it interesting that you're put in such a difficult position.
> I've
> > not followed this thread from the start so my comments might be redundant
> > or
> > inappropriate.
> >
> >If you can, describe the problem. That is, what are you being asked to
> > find and what are the available data? This information helps us to guide
> > you
> > to learning the mechanics for accomplishing your task with R.
> >
> > Regards,
> >
> > Rich
> >
> >
>
> [[alternative HTML version deleted]]
>
>


-- 
OpenPGP:
https://sks-keyservers.net/pks/lookup?op=get=0xFEBAD7FFD041BBA1
If you wish to request my time, please do so using
*bit.ly/hd1AppointmentRequest*.

Sent from my mobile device

[[alternative HTML version deleted]]



Re: [R] Mailinglist

2019-01-06 Thread Rachel Thompson
Hi Rich,

I really feel lost at this point.
I need code that helps me count the phone activity level (high/low/none),
the screen activity (on/off) and the number of calls and SMS messages of
each subject.

1. I want to have a summary of how many times a specific subject got called
(CallLogProbe)
2. I want to have a summary of how many times a specific subject got a text
message (SMS probe)
3. I want to have a summary of how many times a specific subject
- Turned their screen on - True  (ScreenProbe)
- Or did not turn their screen on - False (ScreenProbe)
4.  I want to have a summary of the activity level of a specific subject
- Activity level - none (ActivityProbe)
- Activity level- low (ActivityProbe)
- Activity level - High  (ActivityProbe)

I want to do this for all 36 subjects (participants).
In the end, I have to define the percentages and cutoff points of what is
considered low-medium-high, based on the results of all the subjects, so
that I am able to see whether a specific subject has low social interaction
etc.
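
One possible way to derive such cutoff points, sketched under the assumption
that there is already one count per participant: split the counts at their
terciles with quantile() and label them with cut(). The counts and participant
labels below are invented, and terciles are only one of several reasonable
choices for the cutoffs.

```r
# Hypothetical number of calls per participant
calls <- c(P01 = 2, P02 = 5, P03 = 9, P04 = 12, P05 = 15,
           P06 = 20, P07 = 26, P08 = 31, P09 = 40)

# Cutoff points at one third and two thirds of the distribution
cutoffs <- quantile(calls, probs = c(1/3, 2/3))

# Assign each participant to low / medium / high
level <- cut(calls,
             breaks = c(-Inf, cutoffs, Inf),
             labels = c("low", "medium", "high"))

print(cutoffs)
print(data.frame(participant = names(calls), calls = calls, level = level))
print(table(level))
```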

I have tried a lot, with the help of YouTube etc., but I feel as if I am
trying a lot of things without clearly knowing whether each is the right step.
I have a csv file, but I need to look into what Jeff said about the guides
so that I am able to share it.

Best.


On Sun, Jan 6, 2019 at 11:51 AM Rich Shepard 
wrote:

> On Sun, 6 Jan 2019, Rachel Thompson wrote:
>
> > I am an intern from Amsterdam and I have to do an analysis in R. I spoke
> > to my professor in Amsterdam and my supervisor's here in Boston. But they
> > are to busy to help. I informed them from the start that I am not
> familiar
> > with R(Rstudio) and they told me that I would receive guidance. So since
> > they can not help me, I decided to share my problem online. (It is a CVS
> > file imported into R)
>
> Rachel,
>
>I find it interesting that you're put in such a difficult position. I've
> not followed this thread from the start so my comments might be redundant
> or
> inappropriate.
>
>If you can, describe the problem. That is, what are you being asked to
> find and what are the available data? This information helps us to guide
> you
> to learning the mechanics for accomplishing your task with R.
>
> Regards,
>
> Rich
>
>

[[alternative HTML version deleted]]



Re: [R] Mailinglist

2019-01-06 Thread Rachel Thompson
Hi Rui,

Thank you, I will look into it.

Best,

Rachel
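
For reference, a minimal sketch of the read.csv2 suggestion quoted below. The
file content is made up and passed through the text argument, so the example
runs without a file on disk; with a real file one would give the file name
instead.

```r
# Made-up CSV content in continental European style:
# semicolon as separator, comma as decimal mark
csv_text <- "participant;score\nP01;3,5\nP02;4,25"

# read.csv2 defaults to sep = ";" and dec = ","
dat <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

str(dat)
```

With read.csv() instead, the same content would come in as a single column,
because the separator would not be recognised.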



On Sun, Jan 6, 2019 at 12:27 PM Rui Barradas  wrote:

> Hello,
>
> In many continental European countries, such as mine, the function to
> use is
>
> read.csv2
>
> It defaults to
>
> sep = ";", dec = ","
>
> Note that these functions are in fact calls to read.table with special
> default arguments. Another default that changes is header = TRUE.
> You might also want to set stringsAsFactors = FALSE since the default
> value TRUE is a common source for errors.
>
> Hope this helps,
>
> Rui Barradas
>
> At 16:45 on 06/01/2019, Michael Dewey wrote:
> > Dear Rachel
> >
> > Not sure if this is going to help but if it is a csv file then
> > read.csv() is your friend. Read the help first in case you need to
> > specify what is being used for the decimal point and the separator as if
> > it is from the Netherlands they may not be the default settings.
> >
> > michael
> >
> > On 06/01/2019 16:37, Rachel Thompson wrote:
> >> Hi Jeff,
> >>
> >> Thanks for your email.
> >> I am an intern from Amsterdam and I have to do an analysis in R. I
> >> spoke to
> >> my professor in Amsterdam and my supervisor's here in Boston. But they
> >> are
> >> to busy to help. I informed them from the start that I am not familiar
> >> with
> >> R(Rstudio) and they told me that I would receive guidance. So since they
> >> can not help me, I decided to share my problem online.
> >> (It is a CVS file imported into R)
> >>
> >> Please understand that I am new to this. I will unsubscribe to the
> >> mailing
> >> list if my question does not belong here.
> >>
> >> Thanks,
> >>
> >> Rachel
> >>
> >> On Sun, Jan 6, 2019 at 11:01 AM Jeff Newmiller <
> jdnew...@dcn.davis.ca.us>
> >> wrote:
> >>
> >>> I would not want to leave the impression that I think the task at
> >>> hand is
> >>> merely tedious... my point is that there are numerous steps involved
> and
> >>> each step depends on information that has not been communicated to the
> >>> list, and there is a learning curve even in knowing what to include
> >>> in an
> >>> email question. What I do think is that knowing enough basic R syntax
> to
> >>> express small bits of the problem in R will be a vast improvement over
> >>> attempting to use only English descriptions, and Rachel has to bridge
> >>> that
> >>> initial gap.
> >>>
> >>> For example, some images of data were apparently sent to Jim only,
> >>> yet he
> >>> still does not know in what format the data file is stored, so that
> >>> technique was not very effective. One way for the question to become
> >>> more
> >>> focused is for Rachel to study up on her own how to import data and
> >>> provide
> >>> us with a "dput" (see the StackOverflow discussion I referenced
> >>> before) of
> >>> a small sample of data. Another is for Rachel to use basic R syntax to
> >>> create an anonymous data set from scratch (also outlined in the SO
> >>> discussion). These approaches allow us to keep the focus of our mailing
> >>> list discussion on manipulating the data into summaries. Another
> >>> approach
> >>> is to re-focus the question on importing data by supplying a download
> >>> link
> >>> to the data so we can make suggestions as to what R commands will
> handle
> >>> this data in its raw form. In any case, we cannot leapfrog over the
> >>> data to
> >>> the analysis as the question stands.
> >>>
> >>> Given the above, I have to wonder why Rachel hasn't simply used the
> tool
> >>> she is familiar with... SPSS... to do this? If it is because this is an
> >>> academic assignment to learn R then she should be talking to her
> >>> institutional support (instructor/teaching assistant/tutoring staff)
> >>> anyway
> >>> since there is a no-homework policy on this list (and that avenue would
> >>> have the benefit of being conducted orally and most likely in her
> native
> >>> language).
> >>>
> >>>
> >>> On January 6, 2019 1:12:46 AM PST, Jim Lemon 
> >>> wrote:
>  Hi Rachel,
>  It looks to me as though the first thing you want to do is to get your
>  data, which you attach as images, into a data frame. If these are flat
>  files like CSV or TAB, you should be able to read them in with some
>  variant of the read.table function. If Excel, look at the various
>  Excel import packages. Then you can operate on the data frame by doing
>  things like tabulating Participant ID against the code for SMS or call
>  (which I assume are those 3000+ numbers). You can take the differences
>  in what look like POSIX time values between successive TRUE and FALSE
>  screen values to get the duration of screen activity and it looks like
>  participant activity is recorded at regular intervals. As Jeff
>  suggested, this is really just boring work figuring out how to extract
>  the events:
> 
>  call_indices<-which(Probetype == xxCallLogProbe & ValueSpecified
>  == _id  & Valuedetailed ==3271)
> 
>  using suitable logical 

Re: [R] data frame transformation

2019-01-06 Thread Bert Gunter
... and my reordering of column indices was unnecessary:
merge(dat, d, all.y = TRUE)
will do.
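
A self-contained sketch of this merge on a tiny made-up data set. Here
expand.grid() is swapped in to build every id/letter combination, which is an
assumption of this example and not the code from the earlier message:

```r
# Tiny made-up data set: id 2 has no entry for letter "B"
dat <- data.frame(id     = c(1, 1, 2),
                  letter = c("A", "B", "A"),
                  weight = c(10, 20, 30),
                  stringsAsFactors = FALSE)

# Every combination of id and letter
d <- expand.grid(id     = unique(dat$id),
                 letter = unique(dat$letter),
                 stringsAsFactors = FALSE)

# Keep all rows of d; combinations missing from dat get weight NA
full <- merge(dat, d, all.y = TRUE)
print(full)
```

merge() joins on the columns the two data frames share (id and letter), so the
column order of dat does not matter.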

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
>
> we have:
>
> # The number of unique IDs may differ in the real data set, usually in
> # magnitude of 1
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
>
> # The number of unique "letters" is less than 4000 in the real data set, and
> # there are no duplicates within the same ID
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),
>           sample(c("A","B","C","D","E"),2),sample(c("A","B","C","D","E"),4),
>           sample(c("A","B","C","D","E"),4))
>
> # The number of unique weights is below 50 in the real data set, and there
> # are no duplicates within the same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>           sample(c(1:30),4),sample(c(1:30),4))
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> # The goal is the following transformation: a column is added for each
> # unique letter, and the weight is pulled into that column if the letter
> # exists within the ID, otherwise NA. So we would get datatransfer as below,
> # but without the many steps written out here.
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
>
> Much appreciate the help,
>
> thanks
>
> Andras
>
>

[[alternative HTML version deleted]]



Re: [R] Webshot failed to take snapshot in Ubuntu machine

2019-01-06 Thread Christofer Bogaso
Thanks Martin,

I reinstalled PhantomJS and now it works fine. Regards,
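
For anyone hitting the same problem, the check discussed in this thread can
be sketched from within R roughly as follows. Sys.which() is base R; the
install call is left commented out because it needs the webshot package and
network access.

```r
# Look up the phantomjs executable on the PATH ("" if it is not found)
phantom_path <- Sys.which("phantomjs")

if (!nzchar(phantom_path)) {
  message("phantomjs was not found on the PATH")
  # Uncomment to let webshot download and install its own copy:
  # webshot::install_phantomjs()
} else {
  message("phantomjs found at: ", phantom_path)
}
```

If an old or broken phantomjs is found first on the PATH, reinstalling it, as
was done here, may be what fixes the blank output.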

On Thu, Dec 20, 2018 at 5:30 PM Martin Maechler 
wrote:

> > Marc Girondot via R-help
> > on Tue, 18 Dec 2018 13:53:34 +0100 writes:
>
> > Hi Christofer, I just try on MacOSX and ubuntu and it
> > works on both:
>
> > For ubuntu:
> > >> Sys.info()
> >    sysname  "Linux"
> >    release  "4.15.0-42-generic"
> >    version  "#45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018"
> >    nodename "lepidochelys"
> >    machine  "x86_64"
>
> > Not sure what to do...
> > Marc
>
> Hmm, if I try it (on my Linux desktop), I get
>
>   > library(webshot)
>   > url <- "https://www.bseindia.com/stock-share-price/asian-paints-ltd/asianpaint/500820/"
>   > webshot(url, 'bb.pdf')
>   PhantomJS not found. You can install it with webshot::install_phantomjs().
>   If it is installed, please make sure the phantomjs executable can be found
>   via the PATH variable.
>   NULL
>
> So, it is clear this relies on extra javascript based software
> being available on your computer, *and* having that correctly in
> your PATH.
>
> On my linux system, I then did
>webshot::install_phantomjs()
> and that downloaded things and installed a 67 Megabyte
> executable in my PATH ... which then subsequently worked.
>
> On the Linux system where it did *not* work, try
>
>   system("which phantomjs")
>
> and you should see that it finds a version of 'phantomjs' on your
> computer, i.e., the one that webshot() then tries to use and that
> somehow fails.
>
> I'd recommend you run   webshot::install_phantomjs()
> which then should install a "better" version of the 'phantomjs'
> executable that then *should* work ..
>
> Let us know if this helped (or why not).
>
> Best,
> Martin Maechler
> ETH Zurich
>
> > On 18/12/2018 at 13:37, Christofer Bogaso wrote:
> >> Hi,
> >>
> >> I was using webshot package to take snapshot of a webpage
> >> as below:
> >>
> >> library(webshot)
> >> webshot('https://www.bseindia.com/stock-share-price/asian-paints-ltd/asianpaint/500820/',
> >>         'bb.pdf')
> >>
> >> However what I see is a Blank PDF file is saved.
> >>
> >> However if I use the same code in my windows machine it
> >> is able to produce correct snapshot.
> >>
> >> Below is my system information
> >>> Sys.info()
> >> sysname "Linux" release "4.4.0-139-generic" version
> >> "#165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018" nodename
> >> "ubuntu-s-2vcpu-4gb-blr1-01" machine "x86_64" login
> >> "root" user "root" effective_user "root"
> >>
> >> Any idea what went wrong would be highly helpful.
> >>
> >> Thanks,
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and
> >> more, see https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html and provide
> >> commented, minimal, self-contained, reproducible code.
>
>
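
The PATH check that Martin describes above can also be done from within R. The following is only a minimal sketch using base R's Sys.which(); the variable name phantom_path is illustrative, and the install step is reported as a hint rather than run:

```r
# Sys.which() returns "" when no executable of that name is on the PATH
phantom_path <- Sys.which("phantomjs")

if (!nzchar(phantom_path)) {
  # Nothing found: webshot::install_phantomjs() is the installer mentioned above
  message("phantomjs not found on PATH; consider webshot::install_phantomjs()")
} else {
  # Something was found: this is the executable webshot() would pick up
  message("phantomjs found at: ", phantom_path)
}
```

If a system-wide phantomjs is found but webshot() still writes a blank file, the reported path shows which executable is being used.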



Re: [R] data frame transformation

2019-01-06 Thread Bert Gunter
Like this (using base R only)?

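dat <- data.frame(id = id, letter = letter, weight = weight) # using your data

ud <- unique(dat$id)
ul <- unique(dat$letter)
## every letter x id combination (letter varies slowest)
d <- data.frame(
  letter = rep(ul, each = length(ud)),
  id = rep(ud, times = length(ul))
)

## all.y = TRUE keeps the combinations missing from dat, with weight = NA
merge(dat[, c(2, 1, 3)], d, all.y = TRUE)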
## resulting in:

   letter id weight
1       A  1     25
2       A  2     28
3       A  3     14
4       A  4     27
5       A  5     NA
6       B  1     13
7       B  2     14
8       B  3     NA
9       B  4     15
10      B  5      2
11      C  1     NA
12      C  2     NA
13      C  3     NA
14      C  4     NA
15      C  5     25
16      D  1     24
17      D  2     18
18      D  3     NA
19      D  4     29
20      D  5     27
21      E  1     NA
22      E  2      2
23      E  3     20
24      E  4     25
25      E  5     28
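
The merge above gives a long layout. The original question, quoted below, asked instead for one extra column per letter next to each row. One possible base-R sketch of that reading follows; the small data set and the names dat, letters_u and wide_cols are made up for illustration and are not from the thread:

```r
# Small hypothetical data set in the same shape as the question's data frame
dat <- data.frame(
  id     = c(1, 1, 2, 2, 2, 3),
  letter = c("A", "B", "A", "C", "D", "B"),
  weight = c(25, 13, 28, 7, 18, 4),
  stringsAsFactors = FALSE
)

# One column per unique letter: the row's weight where the letter matches,
# NA otherwise. sapply() over a character vector names the columns for us.
letters_u <- sort(unique(dat$letter))
wide_cols <- sapply(letters_u, function(l) ifelse(dat$letter == l, dat$weight, NA))

# Attach the new columns to the original rows
out <- cbind(dat, wide_cols)
out
```

This replaces the repeated ifelse() steps with a single loop over the unique letters, so it does not depend on how many letters there are.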


Cheers,

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Jan 6, 2019 at 5:16 AM Andras Farkas via R-help <
r-help@r-project.org> wrote:

> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with different
> dimensions and without all the line items here?
>
> we have:
>
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5) # length of unique IDs may differ
> # of course in real data set, usually in magnitude of 1
>
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
>   sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))
> # number of unique "letters" is less than 4000 in real data set and there are no
> # duplicates within same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>   sample(c(1:30),4),sample(c(1:30),4))
> # number of unique weights is below 50 in real data set and there are no
> # duplicates within same ID
>
>
> data<-data.frame(id=id,letter=letter,weight=weight)
>
> #goal is to get the following transformation where a column is added for
> each unique letter and the weight is pulled into the column if the letter
> exist within the ID, otherwise NA
> #so we would get datatransform like below but without the many steps
> described here
>
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
>
> colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
> much appreciate the help,
>
> thanks
>
> Andras
>
>



Re: [R] Mailinglist

2019-01-06 Thread Rui Barradas

Hello,

In many continental European countries, such as mine, the function to
use is

read.csv2

It defaults to

sep = ";", dec = ","

Note that read.csv and read.csv2 are in fact calls to read.table with
special default arguments. Another default that changes is header = TRUE.
You might also want to set stringsAsFactors = FALSE, since the default
value TRUE is a common source of errors.
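
As a rough illustration of the above, here is a minimal sketch that reads a semicolon-separated text with decimal commas. The sample text and its column names are invented; text = is used only so that the example runs without a file. In R 4.0.0 and later, stringsAsFactors already defaults to FALSE, so passing it explicitly is harmless:

```r
# Hypothetical sample: ";" as separator and "," as decimal mark
csv_text <- "participant;probe;value\n1;CallLogProbe;3,5\n2;SMSProbe;1,25\n"

# read.csv2() uses sep = ";" and dec = "," by default
dat <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# Check what was read: value should be numeric, probe should be character
str(dat)
```

With a real file, the first argument would be the file path instead of text =.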


Hope this helps,

Rui Barradas

At 16:45 on 06/01/2019, Michael Dewey wrote:
> Dear Rachel
>
> Not sure if this is going to help but if it is a csv file then
> read.csv() is your friend. Read the help first in case you need to
> specify what is being used for the decimal point and the separator, as
> if it is from the Netherlands they may not be the default settings.
>
> michael


Re: [R] Mailinglist

2019-01-06 Thread Rich Shepard

On Sun, 6 Jan 2019, Rachel Thompson wrote:


> I am an intern from Amsterdam and I have to do an analysis in R. I spoke
> to my professor in Amsterdam and my supervisors here in Boston. But they
> are too busy to help. I informed them from the start that I am not familiar
> with R (RStudio) and they told me that I would receive guidance. So since
> they cannot help me, I decided to share my problem online. (It is a CSV
> file imported into R)


Rachel,

  I find it interesting that you're put in such a difficult position. I've
not followed this thread from the start so my comments might be redundant or
inappropriate.

  If you can, describe the problem. That is, what are you being asked to
find and what are the available data? This information helps us to guide you
to learning the mechanics for accomplishing your task with R.

Regards,

Rich



Re: [R] Mailinglist

2019-01-06 Thread Rachel Thompson
Hi Michael

Thanks, I'll check it out.

Best,

Rachel

On Sun, Jan 6, 2019 at 11:45 AM Michael Dewey 
wrote:

> Dear Rachel
>
> Not sure if this is going to help but if it is a csv file then
> read.csv() is your friend. Read the help first in case you need to
> specify what is being used for the decimal point and the separator as if
> it is from the Netherlands they may not be the default settings.
>
> michael
>

Re: [R] Mailinglist

2019-01-06 Thread Michael Dewey

Dear Rachel

Not sure if this is going to help, but if it is a csv file then
read.csv() is your friend. Read the help first in case you need to
specify what is being used for the decimal point and the separator: if
the file is from the Netherlands, they may not be the default settings.


michael


Re: [R] Mailinglist

2019-01-06 Thread Rachel Thompson
Hi Jeff,

Thanks for your email.
I am an intern from Amsterdam and I have to do an analysis in R. I spoke to
my professor in Amsterdam and my supervisors here in Boston. But they are
too busy to help. I informed them from the start that I am not familiar with
R (RStudio) and they told me that I would receive guidance. So since they
cannot help me, I decided to share my problem online.
(It is a CSV file imported into R)

Please understand that I am new to this. I will unsubscribe to the mailing
list if my question does not belong here.

Thanks,

Rachel

On Sun, Jan 6, 2019 at 11:01 AM Jeff Newmiller 
wrote:

> I would not want to leave the impression that I think the task at hand is
> merely tedious... my point is that there are numerous steps involved and
> each step depends on information that has not been communicated to the
> list, and there is a learning curve even in knowing what to include in an
> email question. What I do think is that knowing enough basic R syntax to
> express small bits of the problem in R will be a vast improvement over
> attempting to use only English descriptions, and Rachel has to bridge that
> initial gap.
>
> For example, some images of data were apparently sent to Jim only, yet he
> still does not know in what format the data file is stored, so that
> technique was not very effective. One way for the question to become more
> focused is for Rachel to study up on her own how to import data and provide
> us with a "dput" (see the StackOverflow discussion I referenced before) of
> a small sample of data. Another is for Rachel to use basic R syntax to
> create an anonymous data set from scratch (also outlined in the SO
> discussion). These approaches allow us to keep the focus of our mailing
> list discussion on manipulating the data into summaries. Another approach
> is to re-focus the question on importing data by supplying a download link
> to the data so we can make suggestions as to what R commands will handle
> this data in its raw form. In any case, we cannot leapfrog over the data to
> the analysis as the question stands.
>
> Given the above, I have to wonder why Rachel hasn't simply used the tool
> she is familiar with... SPSS... to do this? If it is because this is an
> academic assignment to learn R then she should be talking to her
> institutional support (instructor/teaching assistant/tutoring staff) anyway
> since there is a no-homework policy on this list (and that avenue would
> have the benefit of being conducted orally and most likely in her native
> language).
>
>
> On January 6, 2019 1:12:46 AM PST, Jim Lemon  wrote:
> >Hi Rachel,
> >It looks to me as though the first thing you want to do is to get your
> >data, which you attach as images, into a data frame. If these are flat
> >files like CSV or TAB, you should be able to read them in with some
> >variant of the read.table function. If Excel, look at the various
> >Excel import packages. Then you can operate on the data frame by doing
> >things like tabulating Participant ID against the code for SMS or call
> >(which I assume are those 3000+ numbers). You can take the differences
> >in what look like POSIX time values between successive TRUE and FALSE
> >screen values to get the duration of screen activity and it looks like
> >participant activity is recorded at regular intervals. As Jeff
> >suggested, this is really just boring work figuring out how to extract
> >the events:
> >
> >call_indices<-which(Probetype == "xxCallLogProbe" & ValueSpecified
> >== "_id"  & Valuedetailed ==3271)
> >
> >using suitable logical statements and then tabulating them by
> >ParticipantID. If you know how to do that in SPSS, it won't be too
> >hard to translate the logical statements into R syntax as above. I may
> >have misunderstood the variable names, but I think the logic is clear.
> >
> >Jim
> >
> >On Sun, Jan 6, 2019 at 4:07 PM Rachel Thompson
> > wrote:
> >>
> >> Hi Jim,
> >>
> >> Thank you for the clarification. Since I only work in SPSS and I am
> >from Amsterdam I have had problems with specifying what I am trying to
> >do in this specific program and also in clear English language.
> >>
> >> I think I want to indeed aggregate these events for each subject over
> >the observation. But in this case several observations.
> >> 1. I want to have a summary of how many times a specific subject got
> >called (CallLogProbe)
> >> 2. I want to have a summary of how many times a specific subject got
> >a text message (SMS probe)
> >> 3. I want to have a summary of how many times a specific subject
> >> - Turned their screen on - True  (ScreenProbe)
> >> - Or did not turn their screen on - False (ScreenProbe)
> >> 4.  I want to have a summary of the activity level of a specific
> >subject
> >> - Activity level - none (ActivityProbe)
> >> - Activity level- low (ActivityProbe)
> >> - Activity level - High  (ActivityProbe)
> >>
> >> I want to do this for all the 36 subjects(Participants).
> >>
> >> In the end, I have to define 

Re: [R] Mailinglist

2019-01-06 Thread Rachel Thompson
Hi Jim,

Thank you for your email and information.
It is a CSV file which I imported in RStudio.
I will look into what you told me and see if I am able to figure it out.

Best,

Rachel
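
The approach Jim outlines in the message quoted below (pick out the events, then tabulate them by participant) might look roughly like this. It is only a sketch: the data frame events, its column names and the tertile cutoffs are assumptions made for illustration, not the real data or an agreed definition of low/medium/high:

```r
# Hypothetical event log: one row per sensed event (all names are made up)
events <- data.frame(
  ParticipantID = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
  ProbeType = c("CallLogProbe", "SMSProbe", "CallLogProbe",
                "SMSProbe", "ScreenProbe",
                "CallLogProbe", "CallLogProbe", "SMSProbe", "SMSProbe",
                "ScreenProbe"),
  stringsAsFactors = FALSE
)

# Number of events of each probe type per participant
counts <- table(events$ParticipantID, events$ProbeType)
counts

# Calls plus texts per participant as a crude "social interaction" total
social <- counts[, "CallLogProbe"] + counts[, "SMSProbe"]

# Illustrative cutoffs only: split at the tertiles of the observed totals.
# cut() needs distinct break points, so tied quantiles would need handling.
breaks <- quantile(social, probs = c(0, 1/3, 2/3, 1))
level <- cut(social, breaks = breaks, labels = c("low", "medium", "high"),
             include.lowest = TRUE)

data.frame(ParticipantID = names(social), total = as.vector(social),
           level = level)
```

The same table() call works unchanged for 36 participants; only the choice of cutoffs needs a decision from the analyst.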


On Sun, Jan 6, 2019 at 4:12 AM Jim Lemon  wrote:

> Hi Rachel,
> It looks to me as though the first thing you want to do is to get your
> data, which you attach as images, into a data frame. If these are flat
> files like CSV or TAB, you should be able to read them in with some
> variant of the read.table function. If Excel, look at the various
> Excel import packages. Then you can operate on the data frame by doing
> things like tabulating Participant ID against the code for SMS or call
> (which I assume are those 3000+ numbers). You can take the differences
> in what look like POSIX time values between successive TRUE and FALSE
> screen values to get the duration of screen activity and it looks like
> participant activity is recorded at regular intervals. As Jeff
> suggested, this is really just boring work figuring out how to extract
> the events:
>
> call_indices<-which(Probetype == "xxCallLogProbe" & ValueSpecified
> == "_id"  & Valuedetailed ==3271)
>
> using suitable logical statements and then tabulating them by
> ParticipantID. If you know how to do that in SPSS, it won't be too
> hard to translate the logical statements into R syntax as above. I may
> have misunderstood the variable names, but I think the logic is clear.
>
> Jim
>
> On Sun, Jan 6, 2019 at 4:07 PM Rachel Thompson
>  wrote:
> >
> > Hi Jim,
> >
> > Thank you for the clarification. Since I only work in SPSS and I am from
> Amsterdam I have had problems with specifying what I am trying to do in
> this specific program and also in clear English language.
> >
> > I think I want to indeed aggregate these events for each subject over
> the observation. But in this case several observations.
> > 1. I want to have a summary of how many times a specific subject got
> called (CallLogProbe)
> > 2. I want to have a summary of how many times a specific subject got a
> text message (SMS probe)
> > 3. I want to have a summary of how many times a specific subject
> > - Turned their screen on - True  (ScreenProbe)
> > - Or did not turn their screen on - False (ScreenProbe)
> > 4.  I want to have a summary of the activity level of a specific subject
> > - Activity level - none (ActivityProbe)
> > - Activity level- low (ActivityProbe)
> > - Activity level - High  (ActivityProbe)
> >
> > I want to do this for all the 36 subjects(Participants).
> >
> > In the end, I have to define percentages, so I am able to say...Subject
> 36 has low social interactions ( because they only got called and texted
> 500 times in total, while the average of all the participants is 1 or
> something). I have to come up with the percentages myself and define cutoff
> points of what is considered low-medium-high, based on what the results of
> all the subjects are.
> >
> > I hope that I am as clear as possible .
> >
> >
> > I feel as if I am on my way of understanding it, but since I do not
> clearly know, I am trying out a lot of different codes etc. and I do not
> know if I am doing the right thing. I indeed made a new data frame etc, but
> I still feel a bit lost. Do I need to make one per subject or per Probe
> etc..
> >
> >
> > Thanks for your help. I hope that you can help me resolve this issue.
> >
> >
> > Best,
> >
> >
> > Rachel
> >
> >
> >
> >
> >
> >
> > On Sat, Jan 5, 2019 at 9:03 PM Jim Lemon  wrote:
> >>
> >> Hi Rachel,
> >> I'll take a guess and assume that you are monitoring the mobile phones
> >> of 36 people, adding an observation every time some specified change
> >> of state is sensed on each device. I'll also assume that you are only
> >> recording four types of measurement. It seems that you want to
> >> aggregate these events for each subject over the interval or
> >> observation (or over each day or something). I think you are going to
> >> create a new data frame of these summaries from the one you have of
> >> individual observations. Creating each summary doesn't look too hard,
> >> but you will have to define more precisely what you want those
> >> summaries to be. For instance, "I want the mean activity level for
> >> each subject during the overall time that their mobile phone is
> >> switched on". Once you have clearly defined your goals, it probably
> >> won't be too hard to get to them.
> >>
> >> Jim
> >>
> >> On Sun, Jan 6, 2019 at 5:39 AM Rachel Thompson
> >>  wrote:
> >> >
> >> > Dear Mr/Mrs,
> >> >
> >> > This is my first time working in R studio.
> >> > I have a database of 36 participants but it has 150600 entries.
> >> > Column - Column - Column- Column
> >> >
> >> > Participant   Activityprobe - Activity Level  - High/low/none
> >> >
> >> > Participant   Screenprobe - screenon/off -
> >> >
> >> > Participant   SMSprobe etc
> >> >
> >> > Participant   CallLogProbe etc.
> >> >
> >> > I need a code that helps 

Re: [R] Mailinglist

2019-01-06 Thread Jeff Newmiller
I would not want to leave the impression that I think the task at hand is 
merely tedious... my point is that there are numerous steps involved and each 
step depends on information that has not been communicated to the list, and 
there is a learning curve even in knowing what to include in an email question. 
What I do think is that knowing enough basic R syntax to express small bits of 
the problem in R will be a vast improvement over attempting to use only English 
descriptions, and Rachel has to bridge that initial gap.

For example, some images of data were apparently sent to Jim only, yet he still 
does not know in what format the data file is stored, so that technique was not 
very effective. One way for the question to become more focused is for Rachel 
to study up on her own how to import data and provide us with a "dput" (see the 
StackOverflow discussion I referenced before) of a small sample of data. 
Another is for Rachel to use basic R syntax to create an anonymous data set 
from scratch (also outlined in the SO discussion). These approaches allow us to 
keep the focus of our mailing list discussion on manipulating the data into 
summaries. Another approach is to re-focus the question on importing data by 
supplying a download link to the data so we can make suggestions as to what R 
commands will handle this data in its raw form. In any case, we cannot leapfrog 
over the data to the analysis as the question stands.
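
A minimal sketch of the first two approaches, building an anonymous data set from scratch and sharing it with dput(). The column names (Participant, Probe, Value) and all values are assumptions made up for illustration; they are not taken from Rachel's file.

```r
# Hypothetical anonymous data set; names and values are placeholders only.
set.seed(42)
n <- 12
probe_data <- data.frame(
  Participant = rep(1:3, each = 4),
  Probe = rep(c("ActivityProbe", "ScreenProbe", "SMSProbe", "CallLogProbe"),
              times = 3),
  Value = sample(c("high", "low", "none"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object, so a small sample can be
# pasted into a plain-text email instead of attaching an image.
dput(head(probe_data, 6))

# A reader rebuilds the object by evaluating the pasted text. Here the
# round trip is simulated by capturing the dput() output and parsing it.
dput_text <- paste(capture.output(dput(probe_data)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
all.equal(rebuilt, probe_data)
```

Posting the dput() output rather than a screenshot lets readers see the column types as well as the values, which is usually what is needed to suggest import or summary commands.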

Given the above, I have to wonder why Rachel hasn't simply used the tool she is 
familiar with... SPSS... to do this? If it is because this is an academic 
assignment to learn R then she should be talking to her institutional support 
(instructor/teaching assistant/tutoring staff) anyway since there is a 
no-homework policy on this list (and that avenue would have the benefit of 
being conducted orally and most likely in her native language).


On January 6, 2019 1:12:46 AM PST, Jim Lemon  wrote:
>Hi Rachel,
>It looks to me as though the first thing you want to do is to get your
>data, which you attach as images, into a data frame. If these are flat
>files like CSV or TAB, you should be able to read them in with some
>variant of the read.table function. If Excel, look at the various
>Excel import packages. Then you can operate on the data frame by doing
>things like tabulating Participant ID against the code for SMS or call
>(which I assume are those 3000+ numbers). You can take the differences
>in what look like POSIX time values between successive TRUE and FALSE
>screen values to get the duration of screen activity and it looks like
>participant activity is recorded at regular intervals. As Jeff
>suggested, this is really just boring work figuring out how to extract
>the events:
>
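># string values must be quoted; "xxCallLogProbe" and "_id" stand in for
># whatever the actual entries in the data are
>call_indices<-which(Probetype == "xxCallLogProbe" & ValueSpecified == "_id" &
>                    Valuedetailed == 3271)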
>
>using suitable logical statements and then tabulating them by
>ParticipantID. If you know how to do that in SPSS, it won't be too
>hard to translate the logical statements into R syntax as above. I may
>have misunderstood the variable names, but I think the logic is clear.
>
>Jim
>
>On Sun, Jan 6, 2019 at 4:07 PM Rachel Thompson
> wrote:
>>
>> Hi Jim,
>>
>> Thank you for the clarification. Since I only work in SPSS and I am
>> from Amsterdam I have had problems with specifying what I am trying to
>> do in this specific program and also in clear English language.
>>
>> I think I want to indeed aggregate these events for each subject over
>> the observation. But in this case several observations.
>> 1. I want to have a summary of how many times a specific subject got
>> called (CallLogProbe)
>> 2. I want to have a summary of how many times a specific subject got
>> a text message (SMS probe)
>> 3. I want to have a summary of how many times a specific subject
>> - Turned their screen on - True  (ScreenProbe)
>> - Or did not turn their screen on - False (ScreenProbe)
>> 4.  I want to have a summary of the activity level of a specific
>> subject
>> - Activity level - none (ActivityProbe)
>> - Activity level- low (ActivityProbe)
>> - Activity level - High  (ActivityProbe)
>>
>> I want to do this for all the 36 subjects(Participants).
>>
>> In the end, I have to define percentages, so I am able to
>> say...Subject 36 has low social interactions ( because they only got
>> called and texted 500 times in total, while the average of all the
>> participants is 1 or something). I have to come up with the
>> percentages myself and define cutoff points of what is considered
>> low-medium-high, based on what the results of all the subjects are.
>>
>> I hope that I am as clear as possible .
>>
>>
>> I feel as if I am on my way of understanding it, but since I do not
>> clearly know, I am trying out a lot of different codes etc. and I do
>> not know if I am doing the right thing. I indeed made a new data frame
>> etc, but I still feel a bit lost. Do I need to make one per subject or
>> per Probe etc..
>>
>>
>> Thanks for your help. I 

Re: [R] data frame transformation

2019-01-06 Thread K. Elo
Hi!

Maybe this would do the trick:

--- snip ---

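library(reshape2)  # provides dcast()
library(dplyr)     # provides mutate() and %>%

# Copy 'letter' so it can serve both as a row identifier and as the
# source of the new column names, then spread 'weight' across them.
datatransfer <- data %>%
  mutate(letter2 = letter) %>%
  dcast(id + letter ~ letter2, value.var = "weight")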

--- snip ---

Or did I misunderstand something?
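
For comparison, here is a base R sketch of the same transformation that needs no extra packages and keeps the original weight column. It is only one possible approach, and the small data frame below is made up for illustration so that the result is deterministic.

```r
# Small deterministic example in the same shape as the posted data
# (values are invented for illustration).
data <- data.frame(
  id     = c(1, 1, 2, 2, 2, 3),
  letter = c("A", "C", "B", "A", "E", "D"),
  weight = c(10, 4, 7, 22, 15, 3),
  stringsAsFactors = FALSE
)

# One output column per unique letter, filled with NA to start with.
lv <- sort(unique(as.character(data$letter)))
wide <- matrix(NA_real_, nrow = nrow(data), ncol = length(lv),
               dimnames = list(NULL, lv))

# Matrix indexing: for row i, write weight[i] into the column of letter[i].
wide[cbind(seq_len(nrow(data)), match(data$letter, lv))] <- data$weight

# Append the new columns to the original rows.
datatransfer <- data.frame(data, wide)
print(datatransfer)
```

Because the matrix is filled in a single vectorised assignment, this should not need one step per letter regardless of how many unique letters there are. If the real letter values are not syntactically valid column names, data.frame() will alter them unless check.names = FALSE is passed.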

Best,

Kimmo

2019-01-06, 13:16 +, Andras Farkas via R-help wrote:
> Hello Everyone,
> 
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with
> different dimensions and without all the line items here?
> 
> we have:
> 
> id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
> #length of unique IDs may differ of course in real data set, usually
> #in magnitude of 1
> letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),
>           sample(c("A","B","C","D","E"),2),sample(c("A","B","C","D","E"),4),
>           sample(c("A","B","C","D","E"),4))
> #number of unique "letters" is less than 4000 in real data set and
> #there are no duplicates within same ID
> weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
>           sample(c(1:30),4),sample(c(1:30),4))
> #number of unique weights is below 50 in real data set and there are
> #no duplicates within same ID
> 
> 
> data<-data.frame(id=id,letter=letter,weight=weight)
> 
> #goal is to get the following transformation where a column is added
> #for each unique letter and the weight is pulled into the column if
> #the letter exists within the ID, otherwise NA
> #so we would get datatransfer like below but without the many steps
> #described here
> 
> datatransfer<-data.frame(data,apply(data[2],2,function(x)
> ifelse(x=="A",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="B",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="C",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="D",data$weight,NA)))
> datatransfer<-
> data.frame(datatransfer,apply(datatransfer[2],2,function(x)
> ifelse(x=="E",data$weight,NA)))
> 
> colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
> much appreciate the help,
> 
> thanks
> 
> Andras 
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] data frame transformation

2019-01-06 Thread Andras Farkas via R-help
Hello Everyone,

would you be able to assist with some expertise on how to get the following 
done in a way that can be applied to a data set with different dimensions and 
without all the line items here?

we have:

id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
#length of unique IDs may differ of course in real data set, usually
#in magnitude of 1
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),
          sample(c("A","B","C","D","E"),2),sample(c("A","B","C","D","E"),4),
          sample(c("A","B","C","D","E"),4))
#number of unique "letters" is less than 4000 in real data set and
#there are no duplicates within same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4))
#number of unique weights is below 50 in real data set and there are
#no duplicates within same ID


data<-data.frame(id=id,letter=letter,weight=weight)

#goal is to get the following transformation where a column is added
#for each unique letter and the weight is pulled into the column if
#the letter exists within the ID, otherwise NA
#so we would get datatransfer like below but without the many steps
#described here

datatransfer<-data.frame(data,apply(data[2],2,function(x) 
ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x) 
ifelse(x=="E",data$weight,NA)))

colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
much appreciate the help,

thanks

Andras 



Re: [R] Mailinglist

2019-01-06 Thread Jim Lemon
Hi Rachel,
It looks to me as though the first thing you want to do is to get your
data, which you attach as images, into a data frame. If these are flat
files like CSV or TAB, you should be able to read them in with some
variant of the read.table function. If Excel, look at the various
Excel import packages. Then you can operate on the data frame by doing
things like tabulating Participant ID against the code for SMS or call
(which I assume are those 3000+ numbers). You can take the differences
in what look like POSIX time values between successive TRUE and FALSE
screen values to get the duration of screen activity and it looks like
participant activity is recorded at regular intervals. As Jeff
suggested, this is really just boring work figuring out how to extract
the events:

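# string values must be quoted; "xxCallLogProbe" and "_id" stand in for
# whatever the actual entries in the data are
call_indices<-which(Probetype == "xxCallLogProbe" & ValueSpecified == "_id" &
                    Valuedetailed == 3271)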

using suitable logical statements and then tabulating them by
ParticipantID. If you know how to do that in SPSS, it won't be too
hard to translate the logical statements into R syntax as above. I may
have misunderstood the variable names, but I think the logic is clear.
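
The import and tabulation steps described above can be sketched roughly as follows. Everything about the data here is an assumption: the column names (ParticipantID, Probetype, Value), the values, and the inline text standing in for the real file are all hypothetical.

```r
# Hypothetical stand-in for the imported file. With a real CSV this would
# be something like read.csv("probes.csv", stringsAsFactors = FALSE),
# where the file name is a placeholder.
csv_text <- "ParticipantID,Probetype,Value
1,CallLogProbe,3271
1,SMSProbe,3272
1,ScreenProbe,TRUE
1,ScreenProbe,FALSE
1,ActivityProbe,high
2,CallLogProbe,3271
2,CallLogProbe,3271
2,ScreenProbe,TRUE
2,ActivityProbe,none
2,ActivityProbe,low"
probes <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Number of events per participant and probe type (calls, SMS, ...).
events <- table(probes$ParticipantID, probes$Probetype)
print(events)

# Screen on/off counts per participant.
screen <- subset(probes, Probetype == "ScreenProbe")
screen_counts <- table(screen$ParticipantID, screen$Value)

# Activity level counts per participant, then row-wise percentages.
activity <- subset(probes, Probetype == "ActivityProbe")
activity_counts <- table(activity$ParticipantID, activity$Value)
activity_pct <- round(100 * prop.table(activity_counts, margin = 1), 1)
print(activity_pct)
```

One long data frame for all participants is enough here; table() splits the counts by participant, so there is no need for a separate data frame per subject or per probe.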

Jim

On Sun, Jan 6, 2019 at 4:07 PM Rachel Thompson
 wrote:
>
> Hi Jim,
>
> Thank you for the clarification. Since I only work in SPSS and I am from 
> Amsterdam I have had problems with specifying what I am trying to do in this 
> specific program and also in clear English language.
>
> I think I want to indeed aggregate these events for each subject over the 
> observation. But in this case several observations.
> 1. I want to have a summary of how many times a specific subject got called 
> (CallLogProbe)
> 2. I want to have a summary of how many times a specific subject got a text 
> message (SMS probe)
> 3. I want to have a summary of how many times a specific subject
> - Turned their screen on - True  (ScreenProbe)
> - Or did not turn their screen on - False (ScreenProbe)
> 4.  I want to have a summary of the activity level of a specific subject
> - Activity level - none (ActivityProbe)
> - Activity level- low (ActivityProbe)
> - Activity level - High  (ActivityProbe)
>
> I want to do this for all the 36 subjects(Participants).
>
> In the end, I have to define percentages, so I am able to say...Subject 36 
> has low social interactions ( because they only got called and texted 500 
> times in total, while the average of all the participants is 1 or 
> something). I have to come up with the percentages myself and define cutoff 
> points of what is considered low-medium-high, based on what the results of 
> all the subjects are.
>
> I hope that I am as clear as possible .
>
>
> I feel as if I am on my way of understanding it, but since I do not clearly 
> know, I am trying out a lot of different codes etc. and I do not know if I am 
> doing the right thing. I indeed made a new data frame etc, but I still feel a 
> bit lost. Do I need to make one per subject or per Probe etc..
>
>
> Thanks for your help. I hope that you can help me resolve this issue.
>
>
> Best,
>
>
> Rachel
>
>
>
>
>
>
> On Sat, Jan 5, 2019 at 9:03 PM Jim Lemon  wrote:
>>
>> Hi Rachel,
>> I'll take a guess and assume that you are monitoring the mobile phones
>> of 36 people, adding an observation every time some specified change
>> of state is sensed on each device. I'll also assume that you are only
>> recording four types of measurement. It seems that you want to
>> aggregate these events for each subject over the interval or
>> observation (or over each day or something). I think you are going to
>> create a new data frame of these summaries from the one you have of
>> individual observations. Creating each summary doesn't look too hard,
>> but you will have to define more precisely what you want those
>> summaries to be. For instance, "I want the mean activity level for
>> each subject during the overall time that their mobile phone is
>> switched on". Once you have clearly defined your goals, it probably
>> won't be too hard to get to them.
>>
>> Jim
>>
>> On Sun, Jan 6, 2019 at 5:39 AM Rachel Thompson
>>  wrote:
>> >
>> > Dear Mr/Mrs,
>> >
>> > This is my first time working in RStudio.
>> > I have a database of 36 participants but it has 150600 entries.
>> > Column - Column - Column- Column
>> >
>> > Participant   Activityprobe - Activity Level  - High/low/none
>> >
>> > Participant   Screenprobe - screenon/off -
>> >
>> > Participant   SMSprobe etc
>> >
>> > Participant   CallLogProbe etc.
>> >
>> > I need a code that helps me count the activity level of all the 
>> > participants
>> > High activity level. No activity level and Low activity level.
>> > And to help me find out for every participant what the percentages are of
>> > all their high/no/low activity.
>> >
>> > For screenprobe I need to count how many times the participant turned their
>> > screen on and how many times they turned it off and the percentage of
>> > screen on/off.