Let us assume the maximum gap between adjacent columns is two spaces. The
output does not need to be fixed-width; a CSV file would be preferable.
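
A minimal sketch of how the two-space assumption might be applied, in base R.
It treats any run of more than two spaces as hiding at least one skipped
column. The spacing alone cannot say how many columns were skipped, so the
sketch guesses with a hypothetical tuning value, `skip_width` (roughly how
many spaces one skipped column occupies). The function name `parse_gapped`,
the helper `sp`, the gap widths in the sample lines, and the output file
location are all illustrative assumptions, not part of the original data.

```r
# Assumption: neighbouring values that are both present are separated by at
# most `max_gap` spaces, so a longer run of spaces hides skipped column(s).
# `skip_width` is a hypothetical guess at the width of one skipped column.
parse_gapped <- function(lines, max_gap = 2, skip_width = 8) {
  lines  <- lines[nzchar(trimws(lines))]            # drop blank lines
  header <- strsplit(trimws(lines[1]), " +")[[1]]   # column names
  rows <- lapply(lines[-1], function(line) {
    line   <- trimws(line)
    tokens <- regmatches(line, gregexpr("[^ ]+", line))[[1]]
    gaps   <- nchar(regmatches(line, gregexpr(" +", line))[[1]])
    # Short gaps skip nothing; long gaps skip at least one column.
    skipped <- ifelse(gaps <= max_gap, 0, pmax(1, round(gaps / skip_width)))
    cols <- cumsum(c(1, skipped + 1))               # column index of each token
    if (max(cols) > length(header)) {
      stop("gap rule places a value past the last column: ", line)
    }
    row <- rep(NA_character_, length(header))
    row[cols] <- tokens
    row
  })
  dat <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(dat) <- header
  type.convert(dat, as.is = TRUE)                   # x1 becomes integer
}

# Sample lines modelled on the posted data; the gap widths are illustrative.
sp <- function(n) strrep(" ", n)
lines <- c(
  "x1  x2  x3 x4",
  "1 B22",
  paste0("2", sp(9), "C33"),
  paste0("322 B22", sp(6), "D34"),
  paste0("4", sp(17), "D44"),
  paste0("51", sp(9), "D53"),
  "60 D62"
)

dat <- parse_gapped(lines)
print(dat)

# Write a CSV rather than a fixed-width file (hypothetical output location).
out_file <- file.path(tempdir(), "mydata.csv")
write.csv(dat, out_file, row.names = FALSE, na = "")
```

With real data, `skip_width` would need checking against rows whose correct
layout is known, because a different typical value width changes how many
columns a long gap should count for.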

On Mon, Feb 22, 2021 at 8:05 PM jim holtman <jholt...@gmail.com> wrote:
>
> Messed up: I did not see your 'desired' output, which will be hard to produce
> since there is no consistent number of spaces that would indicate the desired
> column number.  Do you have any hint as to how to interpret the spacing,
> especially if you have several hundred more lines?  Is the output supposed to
> be 'fixed' field?
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
>
> On Mon, Feb 22, 2021 at 5:00 PM jim holtman <jholt...@gmail.com> wrote:
>>
>> Try this:
>>
>> > library(tidyverse)
>>
>> > text <-  "x1  x2  x3 x4\n1 B12 \n2       C23 \n322 B32      D34 \n4            D44 \n51     D53\n60 D62         "
>>
>> > # read in the data as characters and replace multiple blanks with single blank
>> > input <- read_lines(text)
>>
>> > input <- str_replace_all(input, ' +', ' ')
>>
>> > mydata <- read_delim(input, ' ', col_names = TRUE)
>> Warning: 5 parsing failures.
>> row col  expected    actual         file
>>   1  -- 4 columns 3 columns literal data
>>   2  -- 4 columns 3 columns literal data
>>   4  -- 4 columns 3 columns literal data
>>   5  -- 4 columns 2 columns literal data
>>   6  -- 4 columns 3 columns literal data
>>
>> > mydata
>> # A tibble: 6 x 4
>>      x1 x2    x3    x4
>>   <dbl> <chr> <chr> <lgl>
>> 1     1 B12   NA    NA
>> 2     2 C23   NA    NA
>> 3   322 B32   D34   NA
>> 4     4 D44   NA    NA
>> 5    51 D53   NA    NA
>> 6    60 D62   NA    NA
>> >
>>
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
>>
>> On Mon, Feb 22, 2021 at 4:49 PM Val <valkr...@gmail.com> wrote:
>>>
>>> That is my problem. The spacing between columns is not consistent.  It
>>> may be a single space or multiple spaces (two or three).
>>>
>>> On Mon, Feb 22, 2021 at 6:14 PM Bill Dunlap <williamwdun...@gmail.com> 
>>> wrote:
>>> >
>>> > You said the column values were separated by space characters.
>>> > Copying the text from gmail shows that some column names and column
>>> > values are separated by single spaces (e.g., between x1 and x2) and
>>> > some by multiple spaces (e.g., between x3 and x4).  Did the mail mess
>>> > up the spacing or is there some other way to tell where the omitted
>>> > values are?
>>> >
>>> > -Bill
>>> >
>>> > On Mon, Feb 22, 2021 at 2:54 PM Val <valkr...@gmail.com> wrote:
>>> > >
>>> > > I tried that one and it did not work. Please see the error message:
>>> > > Error in read.table(text = "x1  x2  x3 x4\n1 B12 \n2       C23
>>> > > \n322 B32      D34 \n4            D44 \n51     D53\n60 D62         ",
>>> > > :
>>> > >   more columns than column names
>>> > >
>>> > > On Mon, Feb 22, 2021 at 5:39 PM Bill Dunlap <williamwdun...@gmail.com> 
>>> > > wrote:
>>> > > >
>>> > > > Since the columns in the file are separated by a space character, " ",
>>> > > > add the read.table argument sep=" ".
>>> > > >
>>> > > > -Bill
>>> > > >
>>> > > > On Mon, Feb 22, 2021 at 2:21 PM Val <valkr...@gmail.com> wrote:
>>> > > > >
>>> > > > > Hi all, I am trying to read messy data but am having difficulty. The
>>> > > > > data has several columns separated by blank space(s).  Each column
>>> > > > > value may have a different length across the rows.  The first row
>>> > > > > (header) has four columns. However, a row may not have all four
>>> > > > > column values.  For instance, the first data row has only the first
>>> > > > > two column values. The fourth data row has the first and last column
>>> > > > > values; the second and third column values are missing for this
>>> > > > > row.  How do I read this data set correctly? Here is my sample data
>>> > > > > set, output and desired output.  To make each data point clear,
>>> > > > > I have added the row and column numbers. I cannot use fixed-width
>>> > > > > format reading because each row may have a different length for a
>>> > > > > given column.
>>> > > > >
>>> > > > > dat<-read.table(text="x1  x2  x3 x4
>>> > > > > 1 B22
>>> > > > > 2         C33
>>> > > > > 322 B22      D34
>>> > > > > 4                 D44
>>> > > > > 51         D53
>>> > > > > 60 D62            ",header=T, fill=T,na.strings=c("","NA"))
>>> > > > >
>>> > > > > Output
>>> > > > >    x1  x2   x3 x4
>>> > > > > 1   1 B12 <NA> NA
>>> > > > > 2   2 C23 <NA> NA
>>> > > > > 3 322 B32  D34 NA
>>> > > > > 4   4 D44 <NA> NA
>>> > > > > 5  51 D53 <NA> NA
>>> > > > > 6  60 D62 <NA> NA
>>> > > > >
>>> > > > >
>>> > > > > Desired output
>>> > > > >    x1   x2   x3   x4
>>> > > > > 1   1  B22 <NA>   NA
>>> > > > > 2   2 <NA>  C33   NA
>>> > > > > 3 322  B32   NA  D34
>>> > > > > 4   4 <NA>   NA  D44
>>> > > > > 5  51 <NA>  D53   NA
>>> > > > > 6  60  D62 <NA>   NA
>>> > > > >
>>> > > > > Thank you,
>>> > > > >
>>> > > > > ______________________________________________
>>> > > > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > > > > PLEASE do read the posting guide 
>>> > > > > http://www.R-project.org/posting-guide.html
>>> > > > > and provide commented, minimal, self-contained, reproducible code.
