On Fri, Jul 18, 2014 at 11:17 AM, John McKown
<john.archie.mck...@gmail.com> wrote:
> Well, this was a shock to me. And I don't really see any documentation
> about it, but perhaps I just can't see it.
>
> > "abc" == "abc "
> [1] FALSE
>
> I guess that I thought of strings in R the way I do in some other
> languages, where the shorter value is padded with blanks to the length
> of the longer value and then compared, i.e. trailing blanks don't
> matter.
>
> The best solution that I have found is to use the str_trim() function
> from the stringr package to remove all the trailing blanks after I get
> the data from the SQL database. I cannot change the SQL schema to make
> the column a varchar instead of a char column. It is a vendor DB. And
> I don't know an ANSI SQL standard way to remove trailing blanks in the
> SELECT command. PostgreSQL has "trim(trailing ' ' from column)", but
> MS-SQL upchucks on that syntax.
>
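
For reference, the comparison quoted above and one way to strip trailing blanks after the data has arrived in R can be sketched as below. This is only a sketch using base R; the helper name rtrim() and the sample data.frame are made up for illustration and stand in for whatever the query returns from the fixed-length CHAR columns.

```r
# Trailing blanks are significant when R compares strings.
"abc" == "abc "                        # FALSE

# Hypothetical helper: strip trailing whitespace using base R only.
rtrim <- function(x) sub("[[:space:]]+$", "", x)

"abc" == rtrim("abc ")                 # TRUE

# Made-up data.frame standing in for rows read from CHAR(n) columns.
df <- data.frame(code = c("abc  ", "de   "),
                 qty  = c(1L, 2L),
                 stringsAsFactors = FALSE)

# Trim every character column; leave the other columns untouched.
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], rtrim)

df$code == c("abc", "de")              # TRUE TRUE
```

Trimming on the R side works regardless of the database, at the cost of transferring the padding over the connection first.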

Well, here I am - talking to myself ... again. My "problem" was, of
course, of my own making. I am getting my data via RODBC from MS-SQL
Server. I was basically doing a "SELECT * FROM TABLE". I normally use
PostgreSQL, not MS-SQL, and I tend to use the "TEXT" data type instead
of CHAR or VARCHAR, so when I do the SELECT, I get back my data
without trailing blanks.

The data I am reading now is created by a software vendor. I guess
that, in order to be database independent, the vendor designed the
tables to contain only fixed-length CHAR and INT columns. The
fixed-length CHAR values are, naturally, padded on the right with
blanks. Now that I understand this (weird as it is to me), I know to
use a SELECT which specifically lists the columns that I want _and_
does a TRIM() (RTRIM() on MS-SQL) on them to remove the trailing
blanks. This will reduce the size, in bytes, of my data.frame and make
it easier to use the comparison operators.

Given how the vendor saves the data, I am quite surprised that they
didn't use SQLite. The tables are simple. There are no "stored
procedures", no VIEWs, no use of SCHEMAs to make subsets. Basically
they just want a simple data store, with the ability to do _simple_
joins. SQLite seems, to me, to be a better fit than requiring the user
to have a full-blown RDBMS such as MS-SQL or Oracle.

Well, thanks for the whack on the head to wake me up and make me
really look at my data.

-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.