---------- Forwarded message ----------
From: Natalie Van Zuydam <nvanzuy...@gmail.com>
Date: Fri, Mar 12, 2010 at 5:49 PM
Subject: Re: [R] sqldf not joining all the fields
To: David Winsemius <dwinsem...@comcast.net>


Dear David

I'm not sure what the problem is as for every item code there is a
corresponding information in the y_data.  For example 842752 from the x_data
corresponds to Aspirin in the y_data?  Yet when I use sqldf to join the two
df's I get NA values in the columns from the y_data in z for 842752 item
code....is there something wrong with my sqldf code or something wrong with
the way I have inputed the data frames?

Thanks for taking the time to help me,
Natalie



On Fri, Mar 12, 2010 at 5:42 PM, David Winsemius <dwinsem...@comcast.net>wrote:

> If I assign the file input to y_data and change you sqldf to
>
> > z  <- sqldf("select * from x_data left join y_data using (item_code)"); z
>
> I can replicate your result. Even after changing the types of the two
> item_code fields to match I still get the same result and when I see to what
> degree they share values I get:
>
> > sum(x_data$item_code %in% y_data$item_code)
> [1] 2
> > sum(y_data$item_code %in% x_data$item_code)
> [1] 2
>
>
> So why are you so sure they are "complete" as you claimed in your first
> email.
>
> --
> David.
>
>
>
> On Mar 12, 2010, at 12:29 PM, David Winsemius wrote:
>
>  You have now given two different assignments to x_data and none to y_data:
>>
>> The str( from the file access offering:
>>
>> > str(x_data)
>> 'data.frame':   2848 obs. of  5 variables:
>> $ item_code       : int  100 110 150 160 161 164 200 210 212 220 ...
>> $ name            : chr  "NEONACLEX K" "NEONACLEX" "MESORB" "ABSORBENT
>> CELLULOSE MESO" ...
>> $ formulation_code: chr  "TABS" "TABS" "DRESS" "DRESS" ...
>> $ strength        : chr  "NULL" "5MG" "10CMX10CM" "10CMX10CM" ...
>> $ bnf_code        : chr  "2.2.8" "2.2.1" "20.3.1" "20.3.1" ...
>>
>> The str from assignment from the dput offering
>> > str(x_data)
>> 'data.frame':   10 obs. of  10 variables:
>> $ prochi         : chr  "CAO0000713" "CAO0000713" "CAO0000713"
>> "CAO0000713" ...
>> $ prescribed_date: chr  "22/06/2001" "28/04/2000" "10/04/2000"
>> "03/07/2000" ...
>> $ dataMonth      : chr  "NULL" "NULL" "NULL" "NULL" ...
>> $ item_code      : chr  "842752" "7800" "842652" "842652" ...
>> $ res_seqno      : chr  "NULL" "NULL" "NULL" "NULL" ...
>> $ quantity       : chr  "60" "100G" "60" "60" ...
>> $ directions     : chr  "1/D" "A/TD" "1/D" "1/D" ...
>> $ no_of_packs    : chr  "NULL" "NULL" "NULL" "NULL" ...
>> $ datasource     : chr  "TSF" "TSF" "TSF" "TSF" ...
>> $ scan_ref_no    : chr  "NULL" "NULL" "NULL" "NULL" ...
>>
>> This code "worked", but it is not clear that the x-y assignments were
>> correct:
>>
>> x_data <- read.table(file="
>> http://n4.nabble.com/file/n1590804/feb09_267_presc_items_tsf.txt";, header
>> = TRUE, sep = "|", quote = "\"'",
>>              dec = ".",as.is = TRUE,na.strings = "NA",colClasses = NA,
>> nrows = 3864284,
>>              skip = 0, check.names = TRUE,fill=TRUE,
>>              strip.white = TRUE, blank.lines.skip = TRUE,
>>              comment.char = "#", allowEscapes = FALSE, flush = FALSE,
>>              fileEncoding = "", encoding = "unknown")
>>
>> --
>> David.
>>
>> On Mar 12, 2010, at 12:23 PM, Newbie19_02 wrote:
>>
>>
>>> The y_data file has over 9000 rows in it so I thought it would be more
>>> practical to give you the file to download....
>>> --
>>> View this message in context:
>>> http://n4.nabble.com/sqldf-not-joining-all-the-fields-tp1590786p1590833.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to