Re: [R] Staging area for data before read into R

2008-10-21 Thread Greg Snow
Stephen,

One of the big problems with spreadsheets (other than the column limit in some) 
is that the standard entry mode allows too much flexibility, which does nothing 
to help you avoid data entry errors.  The web page 
http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html has some 
examples of this going wrong, including one that happened to my group: the 
column for dates was not preformatted, the dates were entered in European 
format, and Excel did 2 different wrong things with them, making it very 
difficult to do anything with the data without major extra work.  If you are 
going to stick with a spreadsheet, then at a minimum you should start by naming 
all your columns, then formatting each column based on the type of data you 
expect to be entered there.
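
To illustrate the date problem, here is a minimal sketch in R of checking a date column after it has been exported as text.  The values in raw_dates are made up for the example, and the day/month/year format is an assumption about how the data were typed.

```r
# Hypothetical raw entries exported as text; the first two were typed
# day/month/year, the last one was typed month/day/year by mistake.
raw_dates <- c("03/04/2008", "25/12/2008", "12/31/2008")

# Parse with an explicit format instead of letting the software guess.
parsed <- as.Date(raw_dates, format = "%d/%m/%Y")

# Entries that do not fit the format come back as NA and can be
# sent back to whoever entered them.
bad <- raw_dates[is.na(parsed)]
print(parsed)
print(bad)
```

Note that this only catches entries that are impossible under the assumed format; something like 03/04/2008 parses cleanly whichever way it was meant, which is why formatting the column before entry is the safer route.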

Going the database route does not take much learning to get started.  You can use 
MS Access or the OpenOffice database: create a new table and enter the name of 
each column along with its data type.  This is a big advantage, since it will 
not allow you to enter character data where numbers are expected, forces dates 
to look like dates, etc.  It is not much extra work to enter the valid 
levels for what will become factors (e.g. Male and Female for sex, so that 
those are the only values allowed; my current record for datasets entered by 
others using spreadsheets is 9 sexes).  You can pick up more as you go 
along, but learning enough to set up a first database for data entry should 
only take you an hour or so.
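
The "only these values allowed" idea can also be checked after the fact in R.  The following is a rough sketch, not a substitute for constraining entry in the database; sex_raw and its contents are invented for the example.

```r
# Hypothetical values as typed by several people into an unconstrained column.
sex_raw <- c("Male", "female", "F", "Female", "male ", "M", "Male")

allowed <- c("Female", "Male")

# factor() with fixed levels turns anything outside the allowed set into NA.
sex <- factor(sex_raw, levels = allowed)

# List the distinct offending entries so they can be corrected at the source.
offenders <- unique(sex_raw[is.na(sex)])
print(offenders)
print(table(sex, useNA = "ifany"))
```

Here two intended categories were typed six different ways, which is how a dataset ends up with far more "sexes" than it should.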

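Another option is to just use R.  The following code gives one approach that 
could get you started entering data:

tmp <- rep( list(character(0), numeric(0)), c(2,5) )
names(tmp) <- c( 'ID','Sex', paste('Stream', 1:5, sep='') )
# stringsAsFactors = TRUE so that Sex becomes a factor (the default changed in R 4.0.0)
tmp <- as.data.frame(tmp, stringsAsFactors = TRUE)
levels(tmp$Sex) <- c('Female','Male')
tmp$ID <- as.character(tmp$ID)

# opens the spreadsheet-like data editor (interactive sessions only)
mydata <- edit(tmp)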
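
Because edit() needs an interactive session, it may be worth checking the result afterwards.  The sketch below assumes a small stand-in for mydata (the rows and the reduced set of Stream columns are made up) and compares each column's class with what the template intended.

```r
# Stand-in for the result of edit(tmp); in a real session this would be typed in.
mydata <- data.frame(
  ID      = c("S01", "S02"),
  Sex     = factor(c("Female", "Male"), levels = c("Female", "Male")),
  Stream1 = c(12, 0),
  Stream2 = c(3, 7),
  stringsAsFactors = FALSE
)

# Classes the template was meant to produce (an assumption for this example).
expected <- c(ID = "character", Sex = "factor",
              Stream1 = "numeric", Stream2 = "numeric")

# Compare the class of each column against the expectation.
actual <- vapply(mydata[names(expected)], function(col) class(col)[1], character(1))
mismatch <- names(expected)[actual != expected]
print(mismatch)
```

An empty result means every column still has the intended type; a count column that has turned into character usually points to a stray non-numeric entry.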

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of stephen sefick
 Sent: Monday, October 20, 2008 7:02 PM
 To: R Help
 Subject: Re: [R] Staging area for data before read into R

 Well, I am going to type in every value because the data sheets are of
 counts of insects that I identified, so I should be okay with
 accuracy...  I really just need something that allows for more than
 256 columns as I have encountered over 256 species of insects in even
 small streams.  I think Calc with its 1000ish columns will do the
 trick... thanks everybody for your help.

 On Mon, Oct 20, 2008 at 8:25 PM, Gabor Grothendieck
 [EMAIL PROTECTED] wrote:
  There is a list of free spreadsheets with their row and column limits
  at this link:
  http://en.wikipedia.org/wiki/OpenOffice.org_Calc
 

Re: [R] Staging area for data before read into R

2008-10-21 Thread Christopher W. Ryan
This is also the sort of thing that EpiData does very well.  That's what
it was designed for:  data entry with minimal errors.  It also simplifies
double data entry for error checking, if you need/want that.
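
For readers unfamiliar with double data entry, the idea can be sketched in R as follows.  This is not how EpiData implements it; entry1 and entry2 are invented data frames standing in for two independent typings of the same paper sheets, and the sketch assumes both list the same IDs in the same order.

```r
# Two hypothetical independent entries of the same paper sheets.
entry1 <- data.frame(ID = c("S01", "S02", "S03"),
                     Stream1 = c(12, 0, 5),
                     Stream2 = c(3, 7, 1))
entry2 <- data.frame(ID = c("S01", "S02", "S03"),
                     Stream1 = c(12, 0, 6),
                     Stream2 = c(3, 7, 1))

# Assumption: same IDs in the same order in both entries.
stopifnot(identical(entry1$ID, entry2$ID))

# Find every cell where the two entries disagree.
count_cols <- setdiff(names(entry1), "ID")
m1 <- as.matrix(entry1[count_cols])
m2 <- as.matrix(entry2[count_cols])
diffs <- which(m1 != m2, arr.ind = TRUE)

# One row per disagreement, to be resolved against the paper record.
discrepancies <- data.frame(
  ID     = entry1$ID[diffs[, "row"]],
  column = count_cols[diffs[, "col"]],
  first  = m1[diffs],
  second = m2[diffs]
)
print(discrepancies)
```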


--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
40 Arch Street, Johnson City, NY  13790
cryanatbinghamtondotedu
PGP public keys available at http://home.stny.rr.com/ryancw/





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Staging area for data before read into R

2008-10-21 Thread Gabor Grothendieck
Excel has a data validation facility and also has data input forms to
facilitate data entry.


Re: [R] Staging area for data before read into R

2008-10-21 Thread Ted Byers
[...]

Re: [R] Staging area for data before read into R

2008-10-21 Thread Rolf Turner


On 22/10/2008, at 8:18 AM, Ted Byers wrote:

<snip>

> ... even with all the power and utility of Excel ...

<snip>

Is this some kind of joke?

cheers,

Rolf Turner



Re: [R] Staging area for data before read into R

2008-10-21 Thread Gabor Grothendieck
On Tue, Oct 21, 2008 at 3:18 PM, Ted Byers [EMAIL PROTECTED] wrote:
> There are tradeoffs no matter what route you take.
> You can do validation in Access as you can in Excel, but Excel is not
> designed to manage data where Access is, and both are crippled by their
> dependence on VB (a seriously broken language: fine for scripting MS

Excel can do validation without VB.  For example, you can restrict
data to a certain range of dates, limit choices by using a list, or
make sure that only positive whole numbers are entered, all without
any VB.
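
The same kind of rule can be re-checked once the data reach R.  As a minimal sketch, assuming a made-up counts vector and treating missing values as invalid (an assumption; zero is allowed here since a species may be absent):

```r
# Hypothetical counts column after import; 2.5 and -1 should never occur
# in count data, and NA is treated as an entry that still needs filling in.
counts <- c(4, 0, 2.5, 17, -1, NA)

# A valid count is present, non-negative and a whole number.
is_valid_count <- !is.na(counts) & counts >= 0 & counts == round(counts)

# Positions of the entries that need a second look.
bad_rows <- which(!is_valid_count)
print(bad_rows)
```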



Re: [R] Staging area for data before read into R

2008-10-21 Thread Ted Byers

I wasn't suggesting that the validation requires VB.  

Creating forms and handling form events does (unless MS has introduced new
utilities to hide all that since last I used it).

Some of the most interesting things I have seen done with Excel did involve
VB, and there are better tools to do most of those things.



Re: [R] Staging area for data before read into R

2008-10-21 Thread Ted Byers

No.  Excel, like most spreadsheets, does what it was designed for reasonably
well.  It is easy to find fault, but not so easy to satisfy all one's
critics.  There is no doubt that Excel has faults, but it provides
significant modelling and analysis capability to users with no programming
expertise or limited experience using IT.

I have used it as a teaching tool for very basic modelling to undergraduate
students who would not have been able to do any modelling without it.  In a
one session course, there just isn't time to teach students enough
programming in any language for them to have a hope of producing an
interesting model.  But they can produce an interesting model with some
guidance using Excel.  Similarly, they can do elementary data analysis
entering their data into Excel and using it to analyse it.

Excel was designed primarily for business people, and I have seen them use
it effectively, doing things I don't fully understand (as I am not a
businessman).  But these same people would go into a catatonic state the
moment a discussion becomes technical or mathematical.  They describe Excel
as powerful, and until I become an expert MBA type, I won't knock them for
that.  If they find it useful, why would I argue with them?  

Don't get me wrong, I do not normally use it, and for 99% of the work I do,
it provides no value to me, so I do not have it installed on my own systems. 
I am better served by C++, Java, and the related tools specific to my work. 
But that it isn't useful to me, or apparently you, is not sufficient grounds
to question its utility for others (neither is the existence of bugs, as ALL
software has bugs: MS makes for an easy target, but I try to be as fair to
them as I am to an independent developer who works alone - let's not have
this degenerate into an attack on MS, please).  As a software engineer
myself, I won't knock the work of another just because what he's produced
isn't particularly useful for me.  I won't even knock him if I don't agree
with the design decisions he's made.  When that happens, it is likely I was
not part of his intended market: nothing more can be implied.




Re: [R] Staging area for data before read into R

2008-10-21 Thread Gabor Grothendieck
You can create data entry forms without VB in Excel too.



Re: [R] Staging area for data before read into R

2008-10-21 Thread Ted Byers
Ah, OK.  That is new since I used Excel last.

Thanks



Re: [R] Staging area for data before read into R

2008-10-21 Thread Gabor Grothendieck
It was in Excel 2003 too, so you must not have used Excel for years.



Re: [R] Staging area for data before read into R

2008-10-21 Thread Dr Eberhard W Lisse

I fully agree: with important or large data sets you cannot be
paranoid enough.

Linux and the Mac allow you to easily write scripts that handle
dumping, zipping, copying (locally and elsewhere) and verifying
the data. Once written correctly and tested, they can run fully
automatically with cron. Been doing this for 15 years.
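
The copy-and-verify step of such a script could also be done from R itself.  The following is only a sketch under assumed names: the temporary directory, the file name counts.csv and the date-stamped backup name are all hypothetical, and a real setup would copy to a different disk or machine.

```r
# Write a small stand-in data set to a temporary working directory.
mydata <- data.frame(ID = c("S01", "S02"), Stream1 = c(12, 0))
work_dir <- tempfile("staging_")
dir.create(work_dir)
src <- file.path(work_dir, "counts.csv")
write.csv(mydata, src, row.names = FALSE)

# Copy to a (hypothetical) backup location with a date stamp in the name.
backup_dir <- file.path(work_dir, "backup")
dir.create(backup_dir)
dest <- file.path(backup_dir,
                  paste0("counts_", format(Sys.Date(), "%Y%m%d"), ".csv"))
copied <- file.copy(src, dest)

# Verify the copy by comparing MD5 checksums of source and backup.
same <- unname(tools::md5sum(src)) == unname(tools::md5sum(dest))
print(same)
```

A script like this can be run non-interactively with Rscript, so it fits the cron approach described above.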


And where you are advised to burn 2 DVDs, burn 5 each. Read the data
on at least two different hardwares and operating systems.  Send at
least one of each by courier to a collaborating colleague on a
different continent.

As they say: different hard disk, different power supply, different
earthquake :-)-O

el

On 21 Oct 2008, at 21:18 , Ted Byers wrote:

> [...]
>
> Dr. Snow is right in recommending going the route of using an RDBMS and
> in saying that it isn't that hard to get started.  I'd be recommending
> PostgreSQL, though, since it is relatively easy to use, and it has pl/r
> (which lets you run R code within stored procedures in the DB) which
> carries obvious advantages.
>
> [...]
>
> If I were in his place, I'd say my data is sacred, and can not be
> replaced (just as you can't step into the same stream twice); and
> therefore I'd use a RDBMS to manage it, and the very moment it is all
> entered, I'd make a backup of both the data (e.g. in MySQL I'd use
> mysqldump) AND the software, and copy both backups to two CDs or DVDs.
> And, if the data were originally recorded on paper, I'd be scanning the
> pages and copying those images onto a couple CDs or DVDs also: with two
> copies on optical media, one copy can be stored in a fireproof vault
> while the other is in the office ready to be used should a HDD fail, or
> some other disaster interrupt my work.  OK, so I'm paranoid about my
> data, but I'd rather go the extra mile than risk losing it.





--
Dr. Eberhard W. Lisse  \/ Obstetrician  Gynaecologist (Saar)
[EMAIL PROTECTED] el108-ARIN /   *   |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ / Please send DNS/NA-NiC related e-mail
Bachbrecht, Namibia ;/ to [EMAIL PROTECTED]



[R] Staging area for data before read into R

2008-10-20 Thread stephen sefick
I am wondering if there is a better alternative than Excel for data
storage that does not require database knowledge (I will eventually
have to learn this, but it is not on my immediate todo list).  I need
something that is not limited to 256 columns... I don't need any of
the built-in functions in Excel, just a spreadsheet-like program with
cells that hold data in a data.frame format for a staging area before
I get it into R.  Any help would be greatly appreciated.  This is not
a direct R question, but all of you folks have more experience than I
do and I am having a time finding what I need with Google.
Thanks in advance
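
One way around a column limit, offered here only as a sketch, is to enter the data in "long" form (one row per stream and species) and let R build the wide table.  The stream names, species names and counts below are invented for the example.

```r
# Hypothetical long-format entry: one row per stream/species/count,
# so the number of species affects rows, not columns.
long <- data.frame(
  stream  = c("A", "A", "B", "B", "B"),
  species = c("Baetis", "Hydropsyche", "Baetis", "Chironomus", "Hydropsyche"),
  count   = c(5, 2, 7, 11, 1)
)

# Cross-tabulate into a stream-by-species table; combinations that were
# never entered become 0.
wide <- xtabs(count ~ stream + species, data = long)
print(wide)
```

With this layout the entry sheet needs only three columns no matter how many species turn up.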

-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] Staging area for data before read into R

2008-10-20 Thread stephen sefick
Sorry, Excel 2003, with no immediate upgrade in the future.

On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
> You didn't say which version of Excel you are using but Excel 2007
> allows 16,384 columns.



Re: [R] Staging area for data before read into R

2008-10-20 Thread Ted Byers

Define "better".

Really, it depends on what you need to do (are all your data appropriately
represented in a 2D array?) and what resources are available.  If all your
data can be represented using a 2D array, then Excel is probably your best
bet for the near term.  If not, you might as well bite the bullet and learn
to use an RDBMS, as there are few other data management options that can
cope with relational or hierarchical or object-oriented data.

I use a number of different RDBMS (ranging from MS SQL to PostgreSQL and
MySQL).  I also use Excel on occasion, and plain text editors (like Emacs),
to create CSV files.  Which I use depends on the details of the particular
problem I am facing.

While I have not yet explored them, I did notice that R includes a number of
facilities for editing data (and the list of options is all the longer when
I use help.search("edit")).

It may be a bit quicker for you to study up on basic use of something like
PostgreSQL, combined with pl/r (something I wish MySQL had), than it would
be to diligently examine all the different options open to you using R.  (I
have a couple books I could recommend that would likely be sufficient for
you to figure out what you need to do with either PostgreSQL or MySQL in a
matter of a week or two).

HTH

Ted




Re: [R] Staging area for data before read into R

2008-10-20 Thread Carl Witthoft
Why not just write your data to a CSV (comma-separated values) or a
tab-separated text file?
You didn't say what software and/or hardware was generating your data,
but most gizmos these days let you dump data to CSV.


No need for Excel at all.
I forget :-( how many rows/columns OpenOffice.org or KOffice can handle.
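
As a rough sketch of the text-file route suggested above, the following
writes a tab-separated file and reads it back into R.  The column names,
values and file name are invented for the example; a real file would come
from whatever device or program produced the data.

```r
# Hypothetical readings; the column names and values are made up.
readings <- data.frame(
  ID     = c("s01", "s02", "s03"),
  Sex    = c("Female", "Male", "Female"),
  Stream = c(4.2, 3.9, 5.1),
  stringsAsFactors = FALSE
)

# Write a tab-separated text file, much as an export menu might.
path <- file.path(tempdir(), "readings.tsv")
write.table(readings, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back; read.delim() assumes tab separators and a header row.
staged <- read.delim(path, stringsAsFactors = FALSE)
str(staged)
```

For a comma-separated file, write.csv() and read.csv() play the same
roles.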

Carl



Re: [R] Staging area for data before read into R

2008-10-20 Thread Jim Porzak
Hi Stephen,

You don't say what "staging" is - do you mean for data entry or loading a
data file for review, or ... ?

In general, I keep away from Excel for data transfer purposes. It tends to
make "intelligent" decisions on data types, leading to strange & bizarre
results (unless you explicitly type each column - which most users don't
do). Integers are interpreted as dates, high-order zeros are stripped off of
ZIP codes, and the like.
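
The same kind of type guessing can bite when a file is read into R, and
the remedy is likewise to type each column explicitly.  A minimal sketch,
assuming an invented three-column CSV; the column names and values are
made up for illustration:

```r
# A small CSV as it might arrive from another program (values invented).
csv_text <- "id,zip,visits
1,02134,3
2,94105,5
3,00501,2"

# Default type guessing turns the ZIP codes into integers,
# which drops their leading zeros.
guessed <- read.csv(text = csv_text)

# Declaring the column classes keeps the ZIP codes exactly as typed.
typed <- read.csv(text = csv_text,
                  colClasses = c("integer", "character", "integer"))

guessed$zip
typed$zip
```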

HTH,
Jim Porzak
TGN.com
San Francisco, CA
http://www.linkedin.com/in/jimporzak
useR Group SF: http://ia.meetup.com/67/




Re: [R] Staging area for data before read into R

2008-10-20 Thread Gabor Grothendieck
There is a list of free spreadsheets with their row and column limits
at this link:
http://en.wikipedia.org/wiki/OpenOffice.org_Calc



Re: [R] Staging area for data before read into R

2008-10-20 Thread stephen sefick
Well, I am going to type in every value because the data sheets are of
counts of insects that I identified, so I should be okay with
accuracy...  I really just need something that allows for more than
256 columns, as I have encountered over 256 species of insects in even
small streams.  I think Calc with its 1000ish columns will do the
trick... thanks everybody for your help.



Re: [R] Staging area for data before read into R

2008-10-20 Thread Christopher W. Ryan
How about simply a text editor, typing in your data, separated by commas
or tabs or spaces?  One row for each case/subject/observation?  R can
read that in easily.
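
A hand-typed file is easy to get slightly wrong, so it may be worth
checking it before use.  The sketch below assumes an invented
comma-separated file with one row per observation; the file name, column
names and values are hypothetical, and one row is deliberately left short.

```r
# Rows as they might be typed into a text editor (values invented);
# the third data row is missing its last field on purpose.
typed_lines <- c(
  "site,species,count",
  "S1,Baetis,12",
  "S1,Hydropsyche,4",
  "S2,Baetis",
  "S2,Chironomus,30"
)
path <- file.path(tempdir(), "typed_counts.csv")
writeLines(typed_lines, path)

# count.fields() reports the number of fields on each line, so ragged
# rows can be located (by line number in the file) before reading.
n_fields <- count.fields(path, sep = ",")
bad_rows <- which(n_fields != n_fields[1])

# read.csv() uses fill = TRUE by default, so the short row is padded
# with NA rather than raising an error.
counts <- read.csv(path, stringsAsFactors = FALSE)
```

Here bad_rows points at the line to fix in the editor, which is usually
better than letting the NA pass through unnoticed.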

A good, open-source, free data entry program is EpiData
(www.epidata.dk).  It is simple to use but probably more than you need
for this task.

--Chris

Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
40 Arch Street, Johnson City, NY  13790
cryanatbinghamtondotedu
PGP public keys available at http://home.stny.rr.com/ryancw/

If you want to build a ship, don't drum up the men to gather wood,
divide the work and give orders. Instead, teach them to yearn for the
vast and endless sea.  [Antoine de St. Exupery]



Re: [R] Staging area for data before read into R

2008-10-20 Thread Seeliger . Curt
> Well, I am going to type in ever value because the data sheets are of
> counts of insects that I identified, so I should be okay with
> accuracy...  I really just need something that allows for more than
> 256 columns as I have encounter over 256 species of insects in even
> small streams. ...

Oh, ugh, that sounds difficult and prone to entry errors.  You might be 
better off organizing in the 'long' format with 4-5 columns:
siteID, subSiteID, species, count, comments

You can then reshape(), or use the reshape package, or even the 'pivot 
table' available in excel and other spreadcheats.
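
A minimal sketch of the long-to-wide step, assuming only three of the
suggested columns (siteID, species, count) and invented values.  The use
of xtabs() is an addition for comparison and is not something mentioned
above.

```r
# Long-format counts: one row per site/species pair (values invented).
long <- data.frame(
  siteID  = c("S1", "S1", "S2", "S2", "S2"),
  species = c("Baetis", "Hydropsyche", "Baetis", "Chironomus", "Hydropsyche"),
  count   = c(12L, 4L, 7L, 30L, 1L),
  stringsAsFactors = FALSE
)

# reshape() returns a data frame with one column per species;
# site/species pairs that were never entered become NA.
wide_df <- reshape(long, idvar = "siteID", timevar = "species",
                   direction = "wide")

# xtabs() sums count within each site/species cell instead;
# pairs that were never entered become 0.
wide_tab <- xtabs(count ~ siteID + species, data = long)
```

Only the species actually found at a site need to be typed in, and the
wide table with one column per species is built by R rather than by hand.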

Glad you have an answer.  Enjoy your day.
cur
-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
[EMAIL PROTECTED]
541/754-4638

