Re: [R] Stringr / Regular Expressions advice

2014-07-01 Thread VINCENT DEAN BOYCE
Sarah,

Yes, I modified the code that you provided and it worked quite well. Here
is the revised code:

accel_data <- data
# pattern to be identified
v.to.match <- c(438, 454, 459)
# re-run the line below whenever the v.to.match criteria change, to
# ensure the match is updated
v.matches <- apply(accel_data, 1, function(x) all(x == v.to.match))
which(v.matches)
[1] 405
sum(v.matches)
[1] 1

Again, here is the dataset:

> dput(head(accel_data, 20))

structure(list(x_reading = c(455L, 451L, 458L, 463L, 462L, 460L,
448L, 449L, 450L, 451L, 445L, 440L, 439L, 445L, 448L, 447L, 440L,
439L, 440L, 434L), y_reading = c(502L, 503L, 502L, 502L, 495L,
505L, 480L, 483L, 489L, 488L, 489L, 456L, 497L, 476L, 470L, 474L,
469L, 482L, 484L, 477L), z_reading = c(454L, 454L, 452L, 452L,
446L, 459L, 456L, 451L, 451L, 455L, 438L, 462L, 437L, 455L, 470L,
455L, 460L, 463L, 458L, 458L)), .Names = c("x_reading", "y_reading",
"z_reading"), row.names = c(NA, 20L), class = "data.frame")

My next goal is to extend the range for each column. For instance:

v.to.match <- c(438:445, 454:460, 459:470)

Your thoughts?

Many thanks,

Vincent






Re: [R] Stringr / Regular Expressions advice

2014-07-01 Thread arun
#or 

res <- mapply(`%in%`, accel_data, v.to.match)

res1 <- sapply(seq_len(ncol(accel_data)), function(i)
  accel_data[i] <= tail(v.to.match[[i]], 1) & accel_data[i] >= v.to.match[[i]][1])

all.equal(res, res1,check.attributes=F)
#[1] TRUE

A.K.

On Tuesday, July 1, 2014 10:56 PM, arun smartpink...@yahoo.com wrote:
Hi Vincent,

You could try:
v.to.match <- list(438:445, 454:460, 459:470)

sapply(seq_len(ncol(accel_data)), function(i)
  accel_data[i] <= tail(v.to.match[[i]], 1) & accel_data[i] >= v.to.match[[i]][1])

#or use ?cut or ?findInterval

A.K.
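
To go from a per-column logical matrix to the rows where all three readings fall inside their ranges, one possible sketch follows. It assumes the ranges are inclusive and uses only the 20 rows posted in the thread; the names accel_head, ranges, in_range and row_matches are made up for illustration.

```r
# The 20 rows posted in the thread (from dput(head(accel_data, 20)))
accel_head <- data.frame(
  x_reading = c(455L, 451L, 458L, 463L, 462L, 460L, 448L, 449L, 450L, 451L,
                445L, 440L, 439L, 445L, 448L, 447L, 440L, 439L, 440L, 434L),
  y_reading = c(502L, 503L, 502L, 502L, 495L, 505L, 480L, 483L, 489L, 488L,
                489L, 456L, 497L, 476L, 470L, 474L, 469L, 482L, 484L, 477L),
  z_reading = c(454L, 454L, 452L, 452L, 446L, 459L, 456L, 451L, 451L, 455L,
                438L, 462L, 437L, 455L, 470L, 455L, 460L, 463L, 458L, 458L)
)

# One inclusive (lower, upper) pair per column; hypothetical name
ranges <- list(x_reading = c(438, 445),
               y_reading = c(454, 460),
               z_reading = c(459, 470))

# Logical matrix: one column per reading, TRUE where the value is in range
in_range <- mapply(function(col, r) col >= r[1] & col <= r[2],
                   accel_head, ranges)

# A row matches only when every column is in range
row_matches <- rowSums(in_range) == ncol(in_range)

which(row_matches)  # 12
sum(row_matches)    # 1
```

In these 20 rows only row 12 (440, 456, 462) satisfies all three ranges; on the full data set the same two calls give the matching row numbers and their count.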








Re: [R] Stringr / Regular Expressions advice

2014-06-27 Thread Sarah Goslee
Hi,

It's a good idea to copy back to the list, not just to me, to keep the
discussion all in one place.

On Thursday, June 26, 2014, VINCENT DEAN BOYCE vincentdeanbo...@gmail.com
wrote:

> Sarah,
>
> Great feedback and direction. Here is the data I am working with*:
>
> dput(head(data_log, 20))
>
> structure(list(x_reading = c(455L, 451L, 458L, 463L, 462L, 460L,
> 448L, 449L, 450L, 451L, 445L, 440L, 439L, 445L, 448L, 447L, 440L,
> 439L, 440L, 434L), y_reading = c(502L, 503L, 502L, 502L, 495L,
> 505L, 480L, 483L, 489L, 488L, 489L, 456L, 497L, 476L, 470L, 474L,
> 469L, 482L, 484L, 477L), z_reading = c(454L, 454L, 452L, 452L,
> 446L, 459L, 456L, 451L, 451L, 455L, 438L, 462L, 437L, 455L, 470L,
> 455L, 460L, 463L, 458L, 458L)), .Names = c("x_reading", "y_reading",
> "z_reading"), row.names = c(NA, 20L), class = "data.frame")
>
> *however, I am unsure why the letter L has been appended to each
> numerical string.


It denotes values stored as integers, and is nothing you need to worry
about.
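
A quick way to see what the L suffix does is the base R sketch below, which uses nothing from the thread's data: the suffix only changes the storage type, not the value that comparisons see.

```r
class(455L)           # "integer"
class(455)            # "numeric"
455L == 455           # TRUE: == compares by value
identical(455L, 455)  # FALSE: identical() also compares the storage type
```

So an exact-match test written with == behaves the same whether the pattern is typed as 455 or 455L.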


> In any event, as you can see there are three columns of data named
> x_reading, y_reading and z_reading. I would like to detect patterns among
> them.
>
> For instance, let's say the pattern I wish to detect is 455, 502, 454
> across the three columns respectively. As you can see in the data, this is
> found in the first row. This particular string recurs numerous times
> within the dataset, and that is what I wish to quantify: how many times
> the string 455, 502, 454 appears.
>
> Your thoughts?


Did you try the code I provided? It does what I think you're looking for.

Sarah


> Many thanks,
>
> Vincent






-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Stringr / Regular Expressions advice

2014-06-26 Thread VINCENT DEAN BOYCE
Hello,

Using R, I've loaded a .csv file comprised of several hundred rows and 3
columns of data. The data within maps the output of a triaxial
accelerometer, a sensor which measures an object's acceleration along the
x,y and z axes. The data for each respective column sequentially
oscillates, and ranges numerically from 100 to 500.

I want to create a function that parses the data and detects patterns across
the three columns.

For instance, I would like to detect instances when the values for the x,y
and z columns equal 150, 200, 300 respectively. Additionally, when a match
is detected, I would like to know how many times the pattern appears.

I have been successful using str_detect to provide a Boolean, however it
seems to only work on a single vector, i.e., 400, not a range of values,
i.e., 400 - 450. See below:


# this works
vals <- str_detect(string = data_log$x_reading, pattern = "400")

# this also works, but doesn't detect the particular range, rather the
# existence of the numbers
vals <- str_detect(string = data_log$x_reading, pattern = "[400-450]")

Also, it appears that I can only apply it to a single column, not to all
three columns. However I may be mistaken.

Any advice on my current approach or alternatives I should consider is
greatly appreciated.

Many thanks,

Vincent



Re: [R] Stringr / Regular Expressions advice

2014-06-26 Thread Adams, Jean
You could define a simple function to detect whether a value is within a
given range.  For example,

inrange <- function(vec, range) {
  !is.na(vec) & vec >= range[1] & vec <= range[2]
}
x <- 1:30
inrange(x, c(5, 20))

If you wanted to apply this function to all three columns at once, you
could use apply().  For example,
apply(data_log, 2, inrange, range = c(400, 450))

Jean
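
As a rough illustration of how such a helper behaves on missing values and on several columns at once, here is a small self-contained sketch. The readings data frame and the 400 to 450 range are made up for the example; the helper is the inclusive range test with an explicit NA guard.

```r
# Inclusive range test; NA values count as "not in range"
inrange <- function(vec, range) {
  !is.na(vec) & vec >= range[1] & vec <= range[2]
}

# Hypothetical readings, including one missing value
readings <- data.frame(x = c(401, NA, 450),
                       y = c(399, 425, 451))

# One logical column per data column; extra arguments are passed to inrange()
hits <- sapply(readings, inrange, range = c(400, 450))

colSums(hits)  # x: 2, y: 1
```

Because both bounds are inclusive, 450 counts as a hit while 399 and 451 do not, and the NA is reported as FALSE rather than propagating.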





Re: [R] Stringr / Regular Expressions advice

2014-06-26 Thread Sarah Goslee
Hi,

On Thu, Jun 26, 2014 at 12:17 PM, VINCENT DEAN BOYCE
vincentdeanbo...@gmail.com wrote:
> Hello,
>
> Using R, I've loaded a .csv file comprised of several hundred rows and 3
> columns of data. The data within maps the output of a triaxial
> accelerometer, a sensor which measures an object's acceleration along the
> x,y and z axes. The data for each respective column sequentially
> oscillates, and ranges numerically from 100 to 500.

If your data are numeric, why are you using stringr?

It would be easier to provide you with an answer if we knew what your
data looked like.

dput(head(yourdata, 20))

and paste that into your non-HTML email.
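
The reason dput() output is so useful on the list is that it is itself R code: whoever reads the email can paste it back and get the same object. A minimal sketch of that round trip, with a made-up two-row yourdata standing in for the real file:

```r
# Hypothetical stand-in for the poster's data
yourdata <- data.frame(x_reading = c(455L, 451L),
                       y_reading = c(502L, 503L))

# Capture the text that dput() would print into the email
txt <- capture.output(dput(head(yourdata, 20)))

# A reader rebuilds the object by evaluating that text
rebuilt <- eval(parse(text = txt))

all.equal(yourdata, rebuilt)  # checks that the round trip kept the data
```

Column names, storage types (note the L suffixes) and row names all travel with the text, which a plain printout of the data frame does not guarantee.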

> I want to create a function that parses the data and detects patterns across
> the three columns.
>
> For instance, I would like to detect instances when the values for the x,y
> and z columns equal 150, 200, 300 respectively. Additionally, when a match
> is detected, I would like to know how many times the pattern appears.

That's easy enough:

fakedata <- data.frame(matrix(c(
100, 100, 200,
150, 200, 300,
100, 350, 100,
400, 200, 300,
200, 500, 200,
150, 200, 300,
150, 200, 300),
ncol=3, byrow=TRUE))

v.to.match <- c(150, 200, 300)

v.matches <- apply(fakedata, 1, function(x) all(x == v.to.match))

# which rows match
which(v.matches)

# how many rows match
sum(v.matches)
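
The same row test can also be written column by column, which avoids apply() converting the data frame to a matrix first. This is only a sketch of an alternative, using the fakedata values above; Map() and Reduce() are base R, and col_hits is a made-up name.

```r
fakedata <- data.frame(matrix(c(
  100, 100, 200,
  150, 200, 300,
  100, 350, 100,
  400, 200, 300,
  200, 500, 200,
  150, 200, 300,
  150, 200, 300),
  ncol = 3, byrow = TRUE))

v.to.match <- c(150, 200, 300)

# Compare each column with its own target value
col_hits <- Map(`==`, fakedata, v.to.match)

# A row matches when all of its column comparisons are TRUE
v.matches <- Reduce(`&`, col_hits)

which(v.matches)  # 2 6 7
sum(v.matches)    # 3
```

Both versions give the same answer here; the column-wise form mainly matters when the columns have different types, since apply() would coerce them to a common one.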

> I have been successful using str_detect to provide a Boolean, however it
> seems to only work on a single vector, i.e., 400, not a range of values,
> i.e., 400 - 450. See below:

This is where I get confused, and where we need sample data. Are your
data numeric, as you state above, or some other format?

If your data are character, and like "400 - 450", you can still match
them with the code I suggested above.

> # this works
> vals <- str_detect(string = data_log$x_reading, pattern = "400")
>
> # this also works, but doesn't detect the particular range, rather the
> # existence of the numbers
> vals <- str_detect(string = data_log$x_reading, pattern = "[400-450]")

Are you trying to match any numeric value in the range 400-450? Again,
actual data.
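
For what it is worth, a bracket expression such as [400-450] is a character class: it matches any single character that is 4, 0, 5, or in the range 0-4, so it cannot express a numeric interval. A small sketch with made-up readings, using base R's grepl() in place of str_detect() so that it runs without stringr:

```r
# Hypothetical readings
readings <- c(123L, 425L, 678L, 999L)

# Regex: TRUE whenever the text contains any of the digits 0 to 5
by_regex <- grepl("[400-450]", readings)
by_regex
# [1]  TRUE  TRUE FALSE FALSE

# Numeric comparison: TRUE only for values between 400 and 450 inclusive
by_value <- readings >= 400 & readings <= 450
by_value
# [1] FALSE  TRUE FALSE FALSE
```

The regex flags 123 because it contains the digits 1, 2 and 3, which is why a numeric comparison is the safer tool when the data are numbers.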

> Also, it appears that I can only apply it to a single column, not to all
> three columns. However I may be mistaken.

You answer your own question unwittingly - apply().

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Stringr / Regular Expressions advice

2014-06-26 Thread arun


Hi,
Maybe you can use ?cut or ?findInterval for the range

dat1 <- read.table(text="100, 100, 200
250, 300, 350
100, 350, 100
400, 250, 300
200, 450, 200
150, 501, 300
150, 250, 300", sep=",", header=FALSE)
sapply(dat1, findInterval, c(400,500))==1
#    V1    V2    V3
#[1,] FALSE FALSE FALSE
#[2,] FALSE FALSE FALSE
#[3,] FALSE FALSE FALSE
#[4,]  TRUE FALSE FALSE
#[5,] FALSE  TRUE FALSE
#[6,] FALSE FALSE FALSE
#[7,] FALSE FALSE FALSE

A.K.
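
One detail worth checking with findInterval() is how the end points are treated. The sketch below uses made-up values around the 400 and 500 break points; by default each interval includes its lower bound and excludes its upper bound.

```r
breaks <- c(400, 500)
x <- c(399, 400, 499, 500)

# 0 = below the first break, 1 = in [400, 500), 2 = at or above 500
default_idx <- findInterval(x, breaks)
default_idx
# [1] 0 1 1 2

# rightmost.closed = TRUE treats the last interval as [400, 500]
closed_idx <- findInterval(x, breaks, rightmost.closed = TRUE)
closed_idx
# [1] 0 1 1 1
```

So comparing the result with 1, as above, keeps 400 but drops 500 unless rightmost.closed = TRUE is set.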


