Re: [R] error = FALSE causes knit2wp to throw duplicate label error

2018-12-17 Thread Nathan Parsons
Thanks for getting me pointed in the right direction. If I happen upon
a satisfactory solution, I will report back!

Nate Parsons
Pronouns: He, Him, His
Graduate Teaching Assistant
Department of Sociology
Portland State University
Portland, Oregon

Schedule an appointment: https://calendly.com/nate-parsons

503-893-8281
503-725-3957 FAX


On Sun, Dec 16, 2018 at 1:46 PM Jeff Newmiller wrote:
>
> This seems a bit deep into knitr for R-help... you might have better luck on 
> StackExchange. I also suggest that posting an incomplete example is usually 
> the kiss of death for getting constructive assistance online.
>
> FWIW my guess is that executing knitr from within an Rmarkdown document is a 
> bad idea unless you are building using child documents. Try manipulating your 
> markdown from an R file.
>
> On December 16, 2018 11:48:44 AM PST, Nathan Parsons wrote:
> >Goal: post from R to Wordpress installation on server.
> >
> >Problem: R keeps returning the error "Error in parse_block(g[-1],
> >g[1], params.src) : duplicate label 'setup'" if error = FALSE in the
> >knitr options or in an R chunk. It works fine if error = TRUE. I could
> >just go through each post each time and remove any returned errors
> >manually, but I'd like to find a more permanent solution.
> >
> >I don't have any duplicate labels; is knit2wp somehow introducing a
> >duplicate label in the .Rmd
> >-> .md / upload process?
> >
> >My code:
> >
> >```{r setup, include=FALSE}
> >## Set the global chunk options for knitting reports
> >  knitr::opts_chunk$set(
> >echo = TRUE,
> >eval = TRUE,
> >message = TRUE,
> >error = FALSE,
> >warning = TRUE,
> >highlight = TRUE,
> >prompt = FALSE
> >  )
> >
> >## Load and activate libraries using 'pacman' package
> >  if (!require(pacman)) {
> >install.packages("pacman", repos = "http://cran.us.r-project.org")
> >  require(pacman)
> >  }
> >
> >  pacman::p_load_gh("duncantl/XMLRPC",
> >"duncantl/RWordPress")
> >  pacman::p_load("knitr")
> >```
> >
> >```{r chunk1, echo = FALSE}
> >## post information
> >  fileName <- "fancy_post.Rmd"
> >  postTitle <- "Fancy Post Title"
> >
> >```
> >
> >blah blah blah...
> >
> >```{r chunk2, echo = FALSE}
> >## Set working directory to correct location
> >  last_dir <- getwd()
> >  setwd("~/Sites/posts")
> >
> >## Tell knitr to create the html code and upload it to your wordpress
> >site
> >  knit2wp(input = fileName,
> >title = postTitle,
> >publish = FALSE,
> >action = 'newPost')
> >
> >  setwd(last_dir)
> >```
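
Following Jeff's suggestion above to drive the knit from a plain R file, here
is a minimal sketch of such a driver (my addition, not code from the thread; it
assumes the RWordPress login/URL options have already been set with options(),
and that fancy_post.Rmd itself no longer contains the knit2wp chunk):

```
## post_driver.R -- run from an interactive R session or via Rscript,
## NOT from a chunk inside the .Rmd that is being posted
library(knitr)
library(RWordPress)   # knit2wp posts through RWordPress/XMLRPC

fileName  <- "~/Sites/posts/fancy_post.Rmd"
postTitle <- "Fancy Post Title"

knit2wp(input   = fileName,
        title   = postTitle,
        publish = FALSE,
        action  = "newPost")
```

If the recursive knit cannot be avoided, knitr can also be told to tolerate the
repeated chunk label with options(knitr.duplicate.label = "allow"), though that
only masks the symptom.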

[R] error = FALSE causes knit2wp to throw duplicate label error

2018-12-16 Thread Nathan Parsons
Goal: post from R to Wordpress installation on server.

Problem: R keeps returning the error "Error in parse_block(g[-1],
g[1], params.src) : duplicate label 'setup'" if error = FALSE in the
knitr options or in an R chunk. It works fine if error = TRUE. I could
just go through each post each time and remove any returned errors
manually, but I'd like to find a more permanent solution.

I don't have any duplicate labels; is knit2wp somehow introducing a
duplicate label in the .Rmd
-> .md / upload process?

My code:

```{r setup, include=FALSE}
## Set the global chunk options for knitting reports
  knitr::opts_chunk$set(
echo = TRUE,
eval = TRUE,
message = TRUE,
error = FALSE,
warning = TRUE,
highlight = TRUE,
prompt = FALSE
  )

## Load and activate libraries using 'pacman' package
  if (!require(pacman)) {
install.packages("pacman", repos = "http://cran.us.r-project.org")
  require(pacman)
  }

  pacman::p_load_gh("duncantl/XMLRPC",
"duncantl/RWordPress")
  pacman::p_load("knitr")
```

```{r chunk1, echo = FALSE}
## post information
  fileName <- "fancy_post.Rmd"
  postTitle <- "Fancy Post Title"

```

blah blah blah...

```{r chunk2, echo = FALSE}
## Set working directory to correct location
  last_dir <- getwd()
  setwd("~/Sites/posts")

## Tell knitr to create the html code and upload it to your wordpress site
  knit2wp(input = fileName,
title = postTitle,
publish = FALSE,
action = 'newPost')

  setwd(last_dir)
```


Traceback:
Error in parse_block(g[-1], g[1], params.src) : duplicate label 'setup'
26. stop("duplicate label '", label, "'")
25. parse_block(g[-1], g[1], params.src)
24. FUN(X[[i]], ...)
23. lapply(groups, function(g) { block = grepl(chunk.begin, g[1]) if
(!set.preamble && !parent_mode()) { return(if (block) "" else g) ...
22. split_file(lines = text)
21. process_file(text, output)
20. knit(input, encoding = encoding, envir = envir)
19. knit2wp(input = fileName, title = postTitle, publish = FALSE,
action = "newPost")
18. eval(expr, envir, enclos)
17. eval(expr, envir, enclos)
16. withVisible(eval(expr, envir, enclos))
15. withCallingHandlers(withVisible(eval(expr, envir, enclos)),
warning = wHandler, error = eHandler, message = mHandler)
14. handle(ev <- withCallingHandlers(withVisible(eval(expr, envir,
enclos)), warning = wHandler, error = eHandler, message = mHandler))
13. timing_fn(handle(ev <- withCallingHandlers(withVisible(eval(expr,
envir, enclos)), warning = wHandler, error = eHandler, message =
mHandler)))
12. evaluate_call(expr, parsed$src[[i]], envir = envir, enclos =
enclos, debug = debug, last = i == length(out), use_try =
stop_on_error != 2L, keep_warning = keep_warning, keep_message =
keep_message, output_handler = output_handler, include_timing =
include_timing)
11. evaluate::evaluate(...)
10. evaluate(code, envir = env, new_device = FALSE, keep_warning =
!isFALSE(options$warning), keep_message = !isFALSE(options$message),
stop_on_error = if (options$error && options$include) 0L else 2L,
output_handler = knit_handlers(options$render, options))
9. in_dir(input_dir(), evaluate(code, envir = env, new_device = FALSE,
keep_warning = !isFALSE(options$warning), keep_message =
!isFALSE(options$message), stop_on_error = if (options$error &&
options$include) 0L else 2L, output_handler =
knit_handlers(options$render, options)))
8. block_exec(params)
7. call_block(x)
6. process_group.block(group)
5. process_group(group)
4. withCallingHandlers(if (tangle) process_tangle(group) else
process_group(group), error = function(e) { setwd(wd) cat(res, sep =
"\n", file = output %n% "") ...
3. process_file(text, output)
2. knit(input, encoding = encoding, envir = envir)
1. knit2wp(input = fileName, title = postTitle, publish = FALSE,
action = "newPost")

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-17 Thread Nathan Parsons
-> c
ifelse(a == TRUE & b == TRUE & c == TRUE, TRUE, FALSE)
}

## Evaluate tweets for presence of search term
th %>%
mutate(flag = map_chr(text, srchr)) -> th_flagged

As far as I can tell, this works. I have to manually enter each set of search 
terms into the function, which is not ideal. Also, this only generates a 
True/False for each tweet based on one search term - I end up with an 
evaluatory column for each search term that I would then have to collapse 
together somehow. I’m sure there’s a more elegant solution.
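
One way to collapse those per-term True/False columns into a single flag (a
sketch, my addition; it assumes the per-term results are logical vectors, for
example the list ans computed in Bert's reply quoted below) is simply to OR
them together:

```
## ans: a list of logical vectors, one element per search phrase
flag <- Reduce(`|`, ans)   # TRUE if the tweet matched any of the phrases
```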

--

Nate Parsons
Pronouns: He, Him, His
Graduate Teaching Assistant
Department of Sociology
Portland State University
Portland, Oregon

503-725-9025
503-725-3957 FAX
On Oct 16, 2018, 7:20 PM -0700, Bert Gunter wrote:
> OK, as no one else has offered a solution, I'll take a whack at it.
>
> Caveats: This is a brute force attempt using R's basic regular expression 
> engine. It is inelegant and barely tested, so likely to be at best incomplete 
> and buggy, and at worst, incorrect. But maybe Nathan or someone else on the 
> list can fix it up. So if (when) it breaks, complain on the list to give 
> someone (almost certainly not me) the opportunity.
>
> The basic idea is that the tweets are just character strings and the search 
> phrases are just character vectors all of whose elements must match 
> "appropriately" -- i.e. they must match whole words -- in the character 
> strings. So my desired output from the code is a list indexed by the search 
> phrases, each of whose components is a logical vector of length the number of
> tweets, each of whose elements = TRUE iff all the words in the search phrase
> match somewhere in the tweet.
>
> Here's the code(using the data Nathan provided):
>
> > words <- sapply(st[[1]],strsplit,split = " +" )
> ## convert the phrases to a list of character vectors of the words
> ## Result:
> > words
> $`me abused depressed`
> [1] "me"    "abused"    "depressed"
>
> $`me hurt depressed`
> [1] "me"    "hurt"  "depressed"
>
> $`feel hopeless depressed`
> [1] "feel"  "hopeless"  "depressed"
>
> $`feel alone depressed`
> [1] "feel"  "alone" "depressed"
>
> $`i feel helpless`
> [1] "i"    "feel" "helpless"
>
> $`i feel worthless`
> [1] "i" "feel"  "worthless"
>
> > expand.words <-  function(z)lapply(z,function(x)paste0(c("^ *"," "," "),x, 
> > c(" "," "," *$")))
> ## function to create regexes for words when they are at the beginning, 
> middle, or end of tweets
>
> > wordregex <- lapply(words,expand.words)
> ##Result
> ## too lengthy to include
> ##
> > tweets <- th$text
> ##extract the tweets
> > findin <- function(x,y)
>    ## x is a vector of regex patterns
>    ## y is a character vector
>    ## value = vector,vec, with length(vec) == length(y) and vec[i] == TRUE 
> iff any of x matches y[i]
> { apply(sapply(x,function(z)grepl(z,y)), 1,any)
> }
>
> ## add a matching "tweet" to the tweet vector:
> > tweets <- c(tweets," i  worthless yxxc ght feel")
>
> > ans <- 
> > lapply(wordregex,function(z)apply(sapply(z,function(x)findin(x,tweets)), 1, 
> > all))
> ## Result:
> > ans
> $`me abused depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`me hurt depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`feel hopeless depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`feel alone depressed`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`i feel helpless`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
> $`i feel worthless`
> [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
>
> ## None of the tweets match any of the phrases except for the last tweet that 
> I added.
>
> ## Note: you need to add capabilities to handle upper and lower case. See, 
> e.g. ?casefold
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> > On Tue, Oct 16, 2018 at 3:03 PM Bert Gunter wrote:
> > > The problem wasn't the data tibbles. You posted in html -- which you were 
> > > explictly warned against -- and that corrupted your text (e.g. some 
> > > quotes became "smart quotes", which cannot be properly cut and pasted 
> > > into
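
On Bert's closing note about case handling, a small sketch of the two usual
options (my addition, not code from the thread) -- lower-case everything up
front with casefold(), or match case-insensitively:

```
## Option 1: normalise case before matching
tweets_lc <- casefold(tweets)            # lower-case the tweets
words_lc  <- lapply(words, casefold)     # lower-case the search words

## Option 2: leave the text alone and ignore case in the match
grepl(" depressed ", tweets, ignore.case = TRUE)
```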

Re: [R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-16 Thread Nathan Parsons
Argh! Here are those two example datasets as data frames (not tibbles).
Sorry again. This apparently is just not my day.


th <- structure(list(
  status_id = c("x1047841705729306624", "x1046966595610927105",
    "x1047094786610552832", "x1046988542818308097", "x1046934493553221632",
    "x1047227442899775488"),
  created_at = c("2018-10-04T13:31:45Z", "2018-10-02T03:34:22Z",
    "2018-10-02T12:03:45Z", "2018-10-02T05:01:35Z", "2018-10-02T01:26:49Z",
    "2018-10-02T20:50:53Z"),
  text = c(
    "Technique is everything with olympic lifts ! @ Body By John https://t.co/UsfR6DafZt",
    "@Subtronics just went back and rewatched ur FBlice with ur CDJs and let me tell you man. You are the fucking messiah",
    "@ic4rus1 Opportunistic means short-game. As in getting drunk now vs. not being hung over tomorrow vs. not fucking up your life ten years later.",
    "I tend to think about my dreams before I sleep.",
    "@MichaelAvenatti @SenatorCollins So, if your client was in her 20s, attending parties with teenagers, doesn't that make her at the least immature as hell, or at the worst, a pedophile and a person contributing to the delinquency of minors?",
    "i wish i could take credit for this"),
  lat = c(43.6835853, 40.284123, 37.7706565, 40.431389, 31.1688935, 33.9376735),
  lng = c(-70.3284118, -83.078589, -122.4359785, -79.9806895, -100.0768885, -118.130426),
  county_name = c("Cumberland County", "Delaware County", "San Francisco County",
    "Allegheny County", "Concho County", "Los Angeles County"),
  fips = c(23005L, 39041L, 6075L, 42003L, 48095L, 6037L),
  state_name = c("Maine", "Ohio", "California", "Pennsylvania", "Texas", "California"),
  state_abb = c("ME", "OH", "CA", "PA", "TX", "CA"),
  urban_level = c("Medium Metro", "Large Fringe Metro", "Large Central Metro",
    "Large Central Metro", "NonCore (Nonmetro)", "Large Central Metro"),
  urban_code = c(3L, 2L, 1L, 1L, 6L, 1L),
  population = c(277308L, 184029L, 830781L, 1160433L, 4160L, 9509611L)),
  class = "data.frame", row.names = c(NA, -6L))


st <- structure(list(terms = c("me abused depressed", "me hurt depressed",
  "feel hopeless depressed", "feel alone depressed", "i feel helpless",
  "i feel worthless")), row.names = c(NA, -6L), class = c("tbl_df",
  "tbl", "data.frame"))

On Tue, Oct 16, 2018 at 2:39 PM Nathan Parsons wrote:

> Thanks all for your patience. Here’s a second go that is perhaps more
> explicative of what it is I am trying to accomplish (and hopefully in plain
> text form)...
>
>
> I’m using the following packages: tidyverse, purrr, tidytext
>
>
> I have a number of tweets in the following form:
>
>
> th <- structure(list(status_id = c("x1047841705729306624",
> "x1046966595610927105",
>
> "x1047094786610552832", "x1046988542818308097", "x1046934493553221632",
>
> "x1047227442899775488"), created_at = c("2018-10-04T13:31:45Z",
>
> "2018-10-02T03:34:22Z", "2018-10-02T12:03:45Z", "2018-10-02T05:01:35Z",
>
> "2018-10-02T01:26:49Z", "2018-10-02T20:50:53Z"), text = c("Technique is
> everything with olympic lifts ! @ Body By John https://t.co/UsfR6DafZt",
>
> "@Subtronics just went back and rewatched ur FBlice with ur CDJs and let
> me tell you man. You are the fucking messiah",
>
> "@ic4rus1 Opportunistic means short-game. As in getting drunk now vs. not
> being hung over tomorrow vs. not fucking up your life ten years later.",
>
> "I tend to think about my dreams before I sleep.", "@MichaelAvenatti
> @SenatorCollins So, if your client was in her 20s, attending parties with
> teenagers, doesn't that make her at the least immature as hell, or at the
> worst, a pedophile and a person contributing to the delinquency of minors?",
>
> "i wish i could take credit for this"), lat = c(43.6835853, 40.284123,
>
> 37.7706565, 40.431389, 31.1688935, 33.9376735), lng = c(-70.3284118,
>
> -83.078589, -122.4359785, -79.9806895, -100.0768885, -118.130426
>
> ), county_name = c("Cumberland County", "Delaware County", "San Francisco
> County",
>
> "Allegheny County", "Concho County", "Los Angeles County"), fips =
> c(23005L,
>
> 39041L, 6075L, 42003L, 48095L, 6037L), state_name = c("Maine"

[R] Matching multiple search criteria (Unlisting a nested dataset, take 2)

2018-10-16 Thread Nathan Parsons
Thanks all for your patience. Here’s a second go that is perhaps more
explicative of what it is I am trying to accomplish (and hopefully in plain
text form)...


I’m using the following packages: tidyverse, purrr, tidytext


I have a number of tweets in the following form:


th <- structure(list(
  status_id = c("x1047841705729306624", "x1046966595610927105",
    "x1047094786610552832", "x1046988542818308097", "x1046934493553221632",
    "x1047227442899775488"),
  created_at = c("2018-10-04T13:31:45Z", "2018-10-02T03:34:22Z",
    "2018-10-02T12:03:45Z", "2018-10-02T05:01:35Z", "2018-10-02T01:26:49Z",
    "2018-10-02T20:50:53Z"),
  text = c(
    "Technique is everything with olympic lifts ! @ Body By John https://t.co/UsfR6DafZt",
    "@Subtronics just went back and rewatched ur FBlice with ur CDJs and let me tell you man. You are the fucking messiah",
    "@ic4rus1 Opportunistic means short-game. As in getting drunk now vs. not being hung over tomorrow vs. not fucking up your life ten years later.",
    "I tend to think about my dreams before I sleep.",
    "@MichaelAvenatti @SenatorCollins So, if your client was in her 20s, attending parties with teenagers, doesn't that make her at the least immature as hell, or at the worst, a pedophile and a person contributing to the delinquency of minors?",
    "i wish i could take credit for this"),
  lat = c(43.6835853, 40.284123, 37.7706565, 40.431389, 31.1688935, 33.9376735),
  lng = c(-70.3284118, -83.078589, -122.4359785, -79.9806895, -100.0768885, -118.130426),
  county_name = c("Cumberland County", "Delaware County", "San Francisco County",
    "Allegheny County", "Concho County", "Los Angeles County"),
  fips = c(23005L, 39041L, 6075L, 42003L, 48095L, 6037L),
  state_name = c("Maine", "Ohio", "California", "Pennsylvania", "Texas", "California"),
  state_abb = c("ME", "OH", "CA", "PA", "TX", "CA"),
  urban_level = c("Medium Metro", "Large Fringe Metro", "Large Central Metro",
    "Large Central Metro", "NonCore (Nonmetro)", "Large Central Metro"),
  urban_code = c(3L, 2L, 1L, 1L, 6L, 1L),
  population = c(277308L, 184029L, 830781L, 1160433L, 4160L, 9509611L)),
  class = c("data.table", "data.frame"), row.names = c(NA, -6L),
  .internal.selfref = )


I also have a number of search terms in the following form:


st <- structure(list(terms = c("me abused depressed", "me hurt depressed",
  "feel hopeless depressed", "feel alone depressed", "i feel helpless",
  "i feel worthless")), row.names = c(NA, -6L), class = c("tbl_df",
  "tbl", "data.frame"))


I am trying to isolate the tweets that contain all of the words in each of
the search terms, i.e. “me”, “abused”, and “depressed” from the first example
search term, but the words do not have to be in order or even next to one
another.


I am familiar with the dplyr suite of tools and have been attempting to
generate some sort of ‘filter()’ to do this. I am not very familiar with
purrr, but there may be a solution using the map function? I have also
explored the tidytext ‘unnest_tokens’ function which transforms the ’th’
data in the following way:


> tidytext::unnest_tokens(th, word, text, token = "tweets") -> tt
> head(tt)
              status_id           created_at      lat       lng
1: x1047841705729306624 2018-10-04T13:31:45Z 43.68359 -70.32841
2: x1047841705729306624 2018-10-04T13:31:45Z 43.68359 -70.32841
3: x1047841705729306624 2018-10-04T13:31:45Z 43.68359 -70.32841
4: x1047841705729306624 2018-10-04T13:31:45Z 43.68359 -70.32841
5: x1047841705729306624 2018-10-04T13:31:45Z 43.68359 -70.32841
6: x1047841705729306624 2018-10-04T13:31:45Z 43.68359 -70.32841
         county_name  fips state_name state_abb  urban_level urban_code
1: Cumberland County 23005      Maine        ME Medium Metro          3
2: Cumberland County 23005      Maine        ME Medium Metro          3
3: Cumberland County 23005      Maine        ME Medium Metro          3
4: Cumberland County 23005      Maine        ME Medium Metro          3
5: Cumberland County 23005      Maine        ME Medium Metro          3
6: Cumberland County 23005      Maine        ME Medium Metro          3
   population       word
1:     277308  technique
2:     277308         is
3:     277308 everything
4:     277308       with
5:     277308    olympic
6:     277308      lifts


but once I have unnested the tokens, I am unable to recombine them back
into tweets.


Ideally the end result would append a new column to the ‘th’ data that
would flag a tweet that contained all of the search words for any of the
search terms; so the work flow would look like:

1) look for all search words for one search term in a tweet
2) if all of the search words in the search term are found, create a flag
(mutate(flag = 1) or some such)
3) do this for all of the tweets
4) move on to the next search term and repeat
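
A sketch of that workflow in a single pass, using stringr (my addition, not
code from the thread; it assumes whole-word, case-insensitive matching is what
is wanted):

```
library(dplyr)
library(purrr)
library(stringr)

## TRUE for a tweet if, for at least one search term, every word of that term
## appears somewhere in the tweet as a whole word (case-insensitive)
matches_any_term <- function(text, terms) {
  map_lgl(text, function(tweet) {
    any(map_lgl(terms, function(term) {
      words <- str_split(term, "\\s+")[[1]]
      all(str_detect(tweet, regex(paste0("\\b", words, "\\b"),
                                  ignore_case = TRUE)))
    }))
  })
}

th_flagged <- th %>% mutate(flag = matches_any_term(text, st$terms))
```

This keeps a single logical flag column rather than one column per search term.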


Again, my thanks for your patience.


--


Nate Parsons
Pronouns: He, Him, His
Graduate Teaching Assistant
Department of Sociology
Portland State University
Portland, Oregon

503-725-9025
503-725-3957 FAX

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Unlisting a nested dataset

2018-10-16 Thread Nathan Parsons
Ista - I provided data, code, and the error being returned, as per the
reproducible-example guidelines. I did not include packages, however.
unnest_tokens is from the tidytext package, map/map_chr are from purrr, and
everything else is from the tidyverse (dplyr/tidyr/etc.).

Not sure what else I can provide to make this more clear.

--

Nate Parsons
Pronouns: He, Him, His
Graduate Teaching Assistant
Department of Sociology
Portland State University
Portland, Oregon

503-725-9025
503-725-3957 FAX
On Oct 16, 2018, 12:35 PM -0700, Ista Zahn wrote:
> Hi Nate,
>
> You've made it pretty difficult to answer your question. Please see
> https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> and follow some of the suggestions you find there to make it easier on
> those who want to help you.
>
> Best,
> Ista
> On Mon, Oct 15, 2018 at 10:56 PM Nathan Parsons wrote:
> >
> > I’m attempting to do some content analysis on a few million tweets, but I 
> > can’t seem to get them cleaned correctly.
> >
> > I’m trying to replicate the process outlined here: 
> > https://stackoverflow.com/questions/46734501/opposite-of-unnest-tokens
> >
> > My code:
> >
> > tweets %>%
> > unnest_tokens(word, text, token = 'tweets') %>%
> > filter(!word %in% stop_words$word) %>%
> > nest(word) %>%
> > mutate(text = map(data, unlist),
> > text = map_chr(text, paste, collapse = " ")) -> tweets
> >
> > Unfortunately, I keep getting:
> >
> > Error in mutate_impl(.data, dots) :
> > Evaluation error: cannot coerce type 'closure' to vector of type 
> > 'character’.
> >
> > What am I doing wrong?
> >
> > Here’s what the dataset looks like:
> >
> > > glimpse(tweets)
> > Observations: 389,253
> > Variables: 12
> > $ status_id "x1047841705729306624", "x1046966595610927105", "x104709...
> > $ created_at "2018-10-04T13:31:45Z", "2018-10-02T03:34:22Z", "2018-10...
> > $ text "Technique is everything with olympic lifts ! @ Body By ...
> > $ lat 43.68359, 40.28412, 37.77066, 40.43139, 31.16889, 33.937...
> > $ lng -70.32841, -83.07859, -122.43598, -79.98069, -100.07689,...
> > $ county_name "Cumberland County", "Delaware County", "San Francisco C...
> > $ fips 23005, 39041, 6075, 42003, 48095, 6037, 6037, 55073, 482...
> > $ state_name "Maine", "Ohio", "California", "Pennsylvania", "Texas", ...
> > $ state_abb "ME", "OH", "CA", "PA", "TX", "CA", "CA", "WI", "TX", "A...
> > $ urban_level "Medium Metro", "Large Fringe Metro", "Large Central Met...
> > $ urban_code 3, 2, 1, 1, 6, 1, 1, 4, 1, 3, 2, 2, 1, 3, 6, 1, 1, 2, 3,...
> > $ population 277308, 184029, 830781, 1160433, 4160, 9509611, 9509611,...
> >
> > --
> >
> > Nate Parsons
> > Pronouns: He, Him, His
> > Graduate Teaching Assistant
> > Department of Sociology
> > Portland State University
> > Portland, Oregon
> >
> > 503-725-9025
> > 503-725-3957 FAX
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Unlisting a nested dataset

2018-10-15 Thread Nathan Parsons
I’m attempting to do some content analysis on a few million tweets, but I can’t 
seem to get them cleaned correctly.

I’m trying to replicate the process outlined here: 
https://stackoverflow.com/questions/46734501/opposite-of-unnest-tokens

My code:

tweets %>%
 unnest_tokens(word, text, token = 'tweets') %>%
 filter(!word %in% stop_words$word) %>%
 nest(word) %>%
 mutate(text = map(data, unlist),
           text = map_chr(text, paste, collapse = " ")) -> tweets

Unfortunately, I keep getting:

 Error in mutate_impl(.data, dots) :
 Evaluation error: cannot coerce type 'closure' to vector of type 'character’.

What am I doing wrong?

Here’s what the dataset looks like:

> glimpse(tweets)
Observations: 389,253
Variables: 12
$ status_id "x1047841705729306624", "x1046966595610927105", "x104709...
$ created_at "2018-10-04T13:31:45Z", "2018-10-02T03:34:22Z", "2018-10...
$ text "Technique is everything with olympic lifts ! @ Body By ...
$ lat 43.68359, 40.28412, 37.77066, 40.43139, 31.16889, 33.937...
$ lng -70.32841, -83.07859, -122.43598, -79.98069, -100.07689,...
$ county_name "Cumberland County", "Delaware County", "San Francisco C...
$ fips 23005, 39041, 6075, 42003, 48095, 6037, 6037, 55073, 482...
$ state_name "Maine", "Ohio", "California", "Pennsylvania", "Texas", ...
$ state_abb "ME", "OH", "CA", "PA", "TX", "CA", "CA", "WI", "TX", "A...
$ urban_level "Medium Metro", "Large Fringe Metro", "Large Central Met...
$ urban_code 3, 2, 1, 1, 6, 1, 1, 4, 1, 3, 2, 2, 1, 3, 6, 1, 1, 2, 3,...
$ population 277308, 184029, 830781, 1160433, 4160, 9509611, 9509611,...
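
For what it is worth, one alternative that sidesteps the nest()/map() step
entirely (a sketch, my addition rather than anything from the thread; it
assumes status_id uniquely identifies a tweet) is to rebuild the text with
group_by()/summarise():

```
library(dplyr)
library(tidytext)

tweets %>%
  unnest_tokens(word, text, token = "tweets") %>%
  filter(!word %in% stop_words$word) %>%                # drop stop words
  group_by(status_id) %>%
  summarise(text = paste(word, collapse = " ")) %>%     # stitch tokens back together
  left_join(select(tweets, -text), by = "status_id") -> tweets_clean
```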

--

Nate Parsons
Pronouns: He, Him, His
Graduate Teaching Assistant
Department of Sociology
Portland State University
Portland, Oregon

503-725-9025
503-725-3957 FAX

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a data frame

2018-10-09 Thread Nathan Parsons
Please post both the code you are using and the error, Abigail.

Nate
On Oct 9, 2018, 11:59 AM -0700, Friedman, Abigail wrote:
> I keep getting error messages when running my data frame code and I cannot 
> figure out what I am doing wrong.
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.