You need to use a JSON library or equivalent to solve this problem. I don't understand why you think that having the data in the clipboard prevents you from doing this, since the clipboard is just another file (but I usually avoid using the clipboard for reproducible analysis anyway).
---------------------------------------------------------------------------
Jeff Newmiller                        DCN:<jdnew...@dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
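
A rough sketch of the library approach suggested above, assuming the jsonlite package is installed (rjson or RJSONIO should work along similar lines). The sample tweet is a made-up stand-in for the clipboard contents, and the commented-out readLines("clipboard") call assumes Windows. If the clipboard holds escaped, quoted lines like the ones in the original post, they would need to be unescaped before parsing.

```r
library(jsonlite)  # assumption: jsonlite is installed

# On Windows the clipboard can be read like any other file:
#   raw_lines <- readLines("clipboard", warn = FALSE)
# A small hypothetical tweet stands in for the clipboard contents here.
raw_lines <- c(
  '{',
  '  "id_str": "433662713886429184",',
  '  "text": "Hond vast in water in Bargerveen bij Zwartemeer",',
  '  "entities": {',
  '    "hashtags": [',
  '      {"text": "Zwartemeer"},',
  '      {"text": "bargerveen"}',
  '    ]',
  '  }',
  '}'
)

# Parse the whole document; simplifyVector = FALSE keeps plain nested lists.
tweet <- fromJSON(paste(raw_lines, collapse = "\n"), simplifyVector = FALSE)

# The top-level "text" field is addressed by its position in the structure,
# so the "text" fields under entities$hashtags cannot be confused with it.
tweet_text <- tweet$text
hashtags <- vapply(tweet$entities$hashtags, function(h) h$text, character(1))

print(tweet_text)
print(hashtags)
```

Because the parser understands the nesting, no multi-line regular expression is needed to tell the tweet text apart from the hashtag text.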
On February 14, 2014 1:29:59 AM PST, Mark Stam <digis...@gmail.com> wrote:
>Hello,
>
>I do data analysis on json data (Twitter). An example of the data:
>
>**********
>" \"id\": 433662713886429200,"
>" \"id_str\": \"433662713886429184\","
>" \"text\": \"Hond vast in water in Bargerveen bij Zwartemeer -
>http://t.co/FqbkOMzYd1 #Zwartemeer #bargerveen #hond #innood\","
>" \"source\": \"<a
>href=\"https://about.twitter.com/products/tweetdeck\"
>rel=\"nofollow\">TweetDeck</a>\","
>**********
>
>I get the contents of the "text" field like this:
>
>r <- regexpr("^( )*\"text(.*?),$", myjsondata)
>text <- regmatches(myjsondata,r)
>txt <- gsub("\"text\":|\",|\"","",text)
>
>Unfortunately, in json there are more fields with the same name, for
>example:
>
>**********
>" \"id\": 433662713886429200,"
>" \"id_str\": \"433662713886429184\","
>" \"text\": \"Hond vast in water in Bargerveen bij Zwartemeer -
>http://t.co/FqbkOMzYd1 #Zwartemeer #bargerveen #hond #innood\","
>" \"source\": \"<a
>href=\"https://about.twitter.com/products/tweetdeck\"
>rel=\"nofollow\">TweetDeck</a>\","
>...
>" \"entities\": {"
>
>" \"hashtags\": ["
>
>" {"
>
>" \"text\": \"Zwartemeer\","
>...
>" \"text\": \"bargerveen\","
>
>...
>" \"text\": \"hond\","
>etc.
>**********
>
>I only want to get the data from the text field between the "id_str" and
>the "source" fields. I don't want to have the data from the text fields
>below "hashtags". I do understand regex, but I don't understand how to do
>it with the criteria from multiple lines.
>
>I know it's possible to use a Json library in R, but in my case I can't,
>because I get the json from raw "clipboard" data.
>
>Thanks !
>
>Mark Stam
>
>______________________________________________
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.