Re: [R] split a factor into single elements

2024-04-02 Thread Bert Gunter
Note:
> levels(factor(c(0,0,1)))  ## just gives you the levels attribute
[1] "0" "1"
> as.character(factor(c(0,0,1))) ## gives you the level of each value in the vector
[1] "0" "0" "1"
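
A minimal sketch of why this matters for the loop in the quoted message, using only base R and the example values from this thread. levels() returns the sorted unique values, so the original order and the repeated 0 are lost, while as.character() keeps one entry per element.

```r
f <- factor(c(2024, 12, 1, 0, 0))

# levels(): sorted, unique values of the factor (4 of them here)
lev <- as.integer(levels(f))

# as.character(): one value per element, in the original order (5 of them)
vals <- as.integer(as.character(f))

# Equivalent idiom: index the levels by the factor's integer codes
vals2 <- as.integer(levels(f))[f]

print(lev)   # 0 1 12 2024
print(vals)  # 2024 12 1 0 0
```

So a loop over the levels would create four variables in sorted order, not five in the order the values were entered.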

Does that answer your question, or have I misunderstood?

Cheers,
Bert



On Tue, Apr 2, 2024 at 12:00 AM Kimmo Elo  wrote:

> Hi,
>
> why would this simple procedure not work?
>
> --- snip ---
> mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
> rainfall_value= 55)
>
> mydf$string_data <- as.factor(mydf$string_data)
>
> values<-as.integer(levels(mydf$string_data))
>
> for (i in 1:length(values)) {
> assign(paste("VAR_", i, sep=""), values[i])
> }
>
> --- snip ---
>
> Best,
>
> Kimmo
>
> Thu, 2024-03-28 at 14:17 +, Ebert,Timothy Aaron wrote:
> > Here are some pieces of working code. I assume you want the second one or
> > the third one that is functionally the same but all in one statement. I
> > do not understand why it is a factor, but I will assume that there is a
> > current and future reason for that. This means I cannot alter the
> > string_data variable, or you can simplify by not making the variable a
> > factor only to turn it back into character.
> >
> > mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
> > rainfall_value= 55)
> > mydf$string_data <- as.factor(mydf$string_data)
> >
> > mydf <- data.frame(id_station = 1234, string_data = "2024, 12, 1, 0, 0",
> > rainfall_value= 55)
> > mydf$string_data <- as.factor(mydf$string_data)
> >
> > mydf <- data.frame(id_station = 1234, string_data = as.factor("2024, 12,
> > 1, 0, 0"), rainfall_value= 55)
> >
> > mydf <- data.frame(id_station = 1234, string_data = as.factor("2024, 12,
> > 1, 0, 0"), rainfall_value= 55)
> > mydf$string_data2 <- as.character(mydf$string_data)
> >
> > # I assume there are many records in the data frame and your example is
> > # for demonstration only.
> > # I cannot assume that all records are the same, though you may be able to
> > # simplify if that is true.
> > #Split the string based on commas.
> > split_values <- strsplit(mydf$string_data2, ",")
> >
> > # find the maximum string length
> > max_length <- max(lengths(split_values))
> >
> > # Add new variables to the data frame
> > for (i in 1:max_length) {
> >   new_var_name <- paste0("VAR_", i)
> >   mydf[[new_var_name]] <- sapply(split_values, function(x)
> > ifelse(length(x) >= i, x[i], NA))
> > }
> >
> > # Convert to numeric
> > for (i in 1:max_length) {
> >   new_var_name <- paste0("VAR_", i)
> >   mydf[[new_var_name]] <- as.numeric(mydf[[new_var_name]])
> > }
> > # remove trash
> > mydf <- mydf[,-4]
> > # Provide more useful names
> > colnames(mydf) <- c("id_station", "string_data", "rainfall_mm", "Year",
> > "Month", "Day", "hour", "minute")
> >
> > Regards,
> > Tim
> >
> > -Original Message-
> > From: R-help  On Behalf Of Stefano Sofia
> > Sent: Thursday, March 28, 2024 7:48 AM
> > To: Fabio D'Agostino ; r-help@R-project.org
> > Subject: Re: [R] split a factor into single elements
> >
> > Sorry for my hurry.
> >
> > The correct reproducible code is different from the initial one. The
> > correct example is
> >
> >
> > mydf <- data.frame(id_station = 1234, string_data = as.factor(2024, 12,
> > 1, 0, 0), rainfall_value= 55)
> >
> >
> > In this case mydf$string_data is a factor, but of length 1 (and not 5
> > like in the initial example).
> >
> > Therefore the suggestion offered by Fabio does not work.
> >
> >
> > Any suggestion?
> >
> > Sorry again for my mistake
> >
> > Stefano
> >
> >
> >
> >  (oo)
> > --oOO--( )--OOo--
> > Stefano Sofia PhD
> > Civil Protection - Marche Region - Italy Meteo Section Snow Section Via
> > del Colle Ameno 5
> > 60126 Torrette di Ancona, Ancona (AN)
> > Uff: +39 071 806 7743
> > E-mail: stefano.so...@regione.marche.it
> > ---Oo-oO
> >
> >
> > 
> > From: Fabio D'Agostino 
> > Sent: Thursday, 28 March 2024 12:20
> > To: Stefano Sofia; r-help@R-project.org
> > Subject: Re: [R] split a factor into single elements
> >
> >
> >
> > Hi Stefano,
> > maybe something like this can help you?
> >
> > myfactor <- as.factor(c(2024, 2, 1, 0, 0))
> >
> > # Convert factor values to integers
> > first_element <- as.integer(as.character(myfactor)[1])
> > second_element <- as.integer(as.character(myfactor)[2])
> > third_element <- as.integer(as.character(myfactor)[3])
> >
> > # Print the results
> > first_element
> > [1] 2024
> > second_element
> > [1] 2
> > third_element
> > [1] 1
> >
> > # Check the type of the object
> > typeof(first_element)
> > [1] "integer"
> >
> > Fabio
> >
> > On Thu, 28 Mar 2024 at 11:29, Stefano Sofia
> > mailto:stefano.so...@regione.marche.it
> >>
> 

Re: [R] split a factor into single elements

2024-04-02 Thread Ebert,Timothy Aaron
Using levels rather than length might cause problems. 2024, 1, 1, 0, 0 will have 
a different number of levels than 2024, 3, 8, 0, 0, and I cannot assume that the 
two trailing zeros are zero for all records. The code can be simplified if you 
can assume more. It might require more work if I have assumed too much. Maybe 
there is another data set where the string is something like 2, 2, 2024, 0, 0? 
Then you need code to figure out the order of values in the string and reorganize 
it into a common format before trying to merge the data.

The other thing is that values of the string are now different rows. You will 
need a bit more code to reshape mydf from long to wide. If the last two 
elements of the string are zero in every record, I would remove these from the 
data first before reshaping.
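
A minimal sketch of the point about levels versus length, assuming each record is one comma-separated string stored in a factor, as in Stefano's corrected example. The two records are made up for illustration, and the column names are the ones from my earlier code. Splitting by position gives one row per record directly, so no long-to-wide reshape is needed in this form.

```r
# Two hypothetical records stored as comma-separated strings in a factor
rec <- factor(c("2024, 1, 1, 0, 0", "2024, 3, 8, 0, 0"))

# Split each record on commas; as.integer() tolerates the leading spaces
parts <- lapply(strsplit(as.character(rec), ","), as.integer)

# The number of distinct values differs between the records ...
n_distinct <- sapply(parts, function(x) length(unique(x)))

# ... but the number of positions does not
n_fields <- lengths(parts)

# One row per record, one column per position
wide <- do.call(rbind, parts)
colnames(wide) <- c("Year", "Month", "Day", "hour", "minute")

print(n_distinct)  # 3 4
print(n_fields)    # 5 5
print(wide)
```

If the order of the fields can differ between data sets, the column names would have to be assigned per data set before merging.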

-Original Message-
From: R-help  On Behalf Of Kimmo Elo
Sent: Tuesday, April 2, 2024 3:00 AM
To: r-help@r-project.org
Subject: Re: [R] split a factor into single elements

Hi,

why would this simple procedure not work?

--- snip ---
mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0), 
rainfall_value= 55)

mydf$string_data <- as.factor(mydf$string_data)

values<-as.integer(levels(mydf$string_data))

for (i in 1:length(values)) {
assign(paste("VAR_", i, sep=""), values[i]) }

--- snip ---

Best,

Kimmo

Thu, 2024-03-28 at 14:17 +, Ebert,Timothy Aaron wrote:
> Here are some pieces of working code. I assume you want the second one
> or the third one that is functionally the same but all in one
> statement. I do not understand why it is a factor, but I will assume
> that there is a current and future reason for that. This means I
> cannot alter the string_data variable, or you can simplify by not
> making the variable a factor only to turn it back into character.
>
> mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
> rainfall_value= 55)
> mydf$string_data <- as.factor(mydf$string_data)
>
> mydf <- data.frame(id_station = 1234, string_data = "2024, 12, 1, 0, 0",
> rainfall_value= 55)
> mydf$string_data <- as.factor(mydf$string_data)
>
> mydf <- data.frame(id_station = 1234, string_data = as.factor("2024,
> 12, 1, 0, 0"), rainfall_value= 55)
>
> mydf <- data.frame(id_station = 1234, string_data = as.factor("2024,
> 12, 1, 0, 0"), rainfall_value= 55)
> mydf$string_data2 <- as.character(mydf$string_data)
>
> # I assume there are many records in the data frame and your example is
> # for demonstration only.
> # I cannot assume that all records are the same, though you may be able
> # to simplify if that is true.
> #Split the string based on commas.
> split_values <- strsplit(mydf$string_data2, ",")
>
> # find the maximum string length
> max_length <- max(lengths(split_values))
>
> # Add new variables to the data frame
> for (i in 1:max_length) {
>   new_var_name <- paste0("VAR_", i)
>   mydf[[new_var_name]] <- sapply(split_values, function(x)
> ifelse(length(x) >= i, x[i], NA))
> }
>
> # Convert to numeric
> for (i in 1:max_length) {
>   new_var_name <- paste0("VAR_", i)
>   mydf[[new_var_name]] <- as.numeric(mydf[[new_var_name]])
> }
> # remove trash
> mydf <- mydf[,-4]
> # Provide more useful names
> colnames(mydf) <- c("id_station", "string_data", "rainfall_mm", "Year",
> "Month", "Day", "hour", "minute")
>
> Regards,
> Tim
>
> -Original Message-
> From: R-help  On Behalf Of Stefano Sofia
> Sent: Thursday, March 28, 2024 7:48 AM
> To: Fabio D'Agostino ; r-help@R-project.org
> Subject: Re: [R] split a factor into single elements
>
> Sorry for my hurry.
>
> The correct reproducible code is different from the initial one. The
> correct example is
>
>
> mydf <- data.frame(id_station = 1234, string_data = as.factor(2024,
> 12, 1, 0, 0), rainfall_value= 55)
>
>
> In this case mydf$string_data is a factor, but of length 1 (and not 5
> like in the initial example).
>
> Therefore the suggestion offered by Fabio does not work.
>
>
> Any suggestion?
>
> Sorry again for my mistake
>
> Stefano
>
>
>
>  (oo)
> --oOO--( )--OOo--
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy Meteo Section Snow Section
> Via del Colle Ameno 5
> 60126 Torrette di Ancona, Ancona (AN)
> Uff: +39 071 806 7743
> E-mail: stefano.so...@regione.marche.it
> ---Oo-oO
>
>
> 
> From: Fabio D'Agostino 
> Sent: Thursday, 28 March 2024 12:20
> To: Stefano Sofia; r-help@R-project.org
> Subject: Re: [R] split a factor into single elements
>
>
>
> Hi Stefano,
> maybe something like this can help you?
>
> myfactor <- as.factor(c(2024, 2, 1, 0, 0))
>
> # Convert factor values to integers
> first_element <- as.integer(as.character(myfactor)[1])
> second_element <- 

Re: [R] How to tweak genomic plot with genoPlotR?

2024-04-02 Thread Luigi Marongiu
Already did...

On Tue, Apr 2, 2024 at 10:45 AM Eric Berger  wrote:
>
> According to https://cran.r-project.org/web/packages/genoPlotR/index.html
> the maintainer of genoPlotR is
>
> Lionel Guy 
>
> Send your question also to him.
>
> On Tue, Apr 2, 2024 at 11:27 AM Luigi Marongiu  
> wrote:
> >
> > I would like to use your genoPlotR package
> > (doi:10.1093/bioinformatics/btq413) to compare the genomes of two
> > isolates of E. coli K-12 that I have. One is a K-12 that was in my
> > lab's fridge; the other is a derivative of K-12 bought some time ago,
> > HB101.
> > I tried to use genoPlotR, but I could not understand some functions
> > from your vignette. I would like to ask you whether you could help me
> > with this.
> >
> > I aligned the genomes (reference K-12 plus my isolates) with
> > `progressiveMauve --weight=15 --output=./K12_Aln.fa K12_multi.fa`,
> > where K12_multi.fa contains the fasta sequences of the reference and
> > the consensuses I obtained from my isolates after Illumina NGS. I then
> > ran this script:
> >
> > ```
> > ## get data
> > bbone_file = "./K12_Aln.backbone"
> > bbone = read_mauve_backbone(bbone_file, ref=2)
> > names(bbone$dna_segs) = c("K-12 ref.", "K-12 Ho", "HB101 Ho")
> >
> > ## calculate lengths
> > for (i in 1:length(bbone$comparisons)) {
> >   cmp = bbone$comparisons[[i]]
> >   bbone$comparisons[[i]]$length = abs(cmp$end1 - cmp$end1) +
> >   abs(cmp$end2 - cmp$end2)
> > }
> >
> > ## plot
> > plot_gene_map(dna_segs = bbone$dna_segs,
> >   comparisons = bbone$comparisons,
> >   global_color_scheme = c("length", "increasing", "red_blue", 
> > 0.7),
> >   override_color_schemes = TRUE)
> > ```
> > I got the following plot: https://u.cubeupload.com/Gigiux/Rplot.png
> > My questions are:
> > - How can I load the annotations? I have the K-12 annotations in gff3
> > and genebank formats, but how do I load them in the system so that I
> > plot it here?
> > - Is it possible to zoom in?
> > - Is it possible to change the color scheme?
> > Thank you
> >
> >
> >
> >
> > --
> > Best regards,
> > Luigi
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.



-- 
Best regards,
Luigi



Re: [R] How to tweak genomic plot with genoPlotR?

2024-04-02 Thread Eric Berger
According to https://cran.r-project.org/web/packages/genoPlotR/index.html
the maintainer of genoPlotR is

Lionel Guy 

Send your question also to him.
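
One thing that may be worth checking in the quoted script in the meantime: the length computation subtracts end1 from end1 and end2 from end2, so every length is 0 and the "length" color scheme has nothing to distinguish. A minimal base-R sketch of the difference follows. The data frame is a made-up stand-in for one element of bbone$comparisons, assuming the usual start1, end1, start2 and end2 columns of a genoPlotR comparison.

```r
# Hypothetical stand-in for one element of bbone$comparisons
cmp <- data.frame(start1 = c(1, 5000), end1 = c(1200, 4000),
                  start2 = c(10, 7000), end2 = c(1250, 8100))

# As written in the script: each term subtracts a column from itself
len_as_posted <- abs(cmp$end1 - cmp$end1) + abs(cmp$end2 - cmp$end2)

# Probably intended: block length on each genome, summed
len_intended <- abs(cmp$end1 - cmp$start1) + abs(cmp$end2 - cmp$start2)

print(len_as_posted)  # 0 0
print(len_intended)   # 2439 2100
```

With start1 and start2 in place, the colors in the plot should vary with block length.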

On Tue, Apr 2, 2024 at 11:27 AM Luigi Marongiu  wrote:
>
> I would like to use your genoPlotR package
> (doi:10.1093/bioinformatics/btq413) to compare the genomes of two
> isolates of E. coli K-12 that I have. One is a K-12 that was in my
> lab's fridge; the other is a derivative of K-12 bought some time ago,
> HB101.
> I tried to use genoPlotR, but I could not understand some functions
> from your vignette. I would like to ask you whether you could help me
> with this.
>
> I aligned the genomes (reference K-12 plus my isolates) with
> `progressiveMauve --weight=15 --output=./K12_Aln.fa K12_multi.fa`,
> where K12_multi.fa contains the fasta sequences of the reference and
> the consensuses I obtained from my isolates after Illumina NGS. I then
> ran this script:
>
> ```
> ## get data
> bbone_file = "./K12_Aln.backbone"
> bbone = read_mauve_backbone(bbone_file, ref=2)
> names(bbone$dna_segs) = c("K-12 ref.", "K-12 Ho", "HB101 Ho")
>
> ## calculate lengths
> for (i in 1:length(bbone$comparisons)) {
>   cmp = bbone$comparisons[[i]]
>   bbone$comparisons[[i]]$length = abs(cmp$end1 - cmp$end1) +
>   abs(cmp$end2 - cmp$end2)
> }
>
> ## plot
> plot_gene_map(dna_segs = bbone$dna_segs,
>   comparisons = bbone$comparisons,
>   global_color_scheme = c("length", "increasing", "red_blue", 
> 0.7),
>   override_color_schemes = TRUE)
> ```
> I got the following plot: https://u.cubeupload.com/Gigiux/Rplot.png
> My questions are:
> - How can I load the annotations? I have the K-12 annotations in gff3
> and genebank formats, but how do I load them in the system so that I
> plot it here?
> - Is it possible to zoom in?
> - Is it possible to change the color scheme?
> Thank you
>
>
>
>
> --
> Best regards,
> Luigi
>



[R] How to tweak genomic plot with genoPlotR?

2024-04-02 Thread Luigi Marongiu
I would like to use your genoPlotR package
(doi:10.1093/bioinformatics/btq413) to compare the genomes of two
isolates of E. coli K-12 that I have. One is a K-12 that was in my
lab's fridge; the other is a derivative of K-12 bought some time ago,
HB101.
I tried to use genoPlotR, but I could not understand some functions
from your vignette. I would like to ask you whether you could help me
with this.

I aligned the genomes (reference K-12 plus my isolates) with
`progressiveMauve --weight=15 --output=./K12_Aln.fa K12_multi.fa`,
where K12_multi.fa contains the fasta sequences of the reference and
the consensuses I obtained from my isolates after Illumina NGS. I then
ran this script:

```
## get data
bbone_file = "./K12_Aln.backbone"
bbone = read_mauve_backbone(bbone_file, ref=2)
names(bbone$dna_segs) = c("K-12 ref.", "K-12 Ho", "HB101 Ho")

## calculate lengths
for (i in 1:length(bbone$comparisons)) {
  cmp = bbone$comparisons[[i]]
  bbone$comparisons[[i]]$length = abs(cmp$end1 - cmp$end1) +
  abs(cmp$end2 - cmp$end2)
}

## plot
plot_gene_map(dna_segs = bbone$dna_segs,
  comparisons = bbone$comparisons,
  global_color_scheme = c("length", "increasing", "red_blue", 0.7),
  override_color_schemes = TRUE)
```
I got the following plot: https://u.cubeupload.com/Gigiux/Rplot.png
My questions are:
- How can I load the annotations? I have the K-12 annotations in gff3
and genebank formats, but how do I load them in the system so that I
plot it here?
- Is it possible to zoom in?
- Is it possible to change the color scheme?
Thank you




--
Best regards,
Luigi



Re: [R] split a factor into single elements

2024-04-02 Thread Kimmo Elo
Hi,

why would this simple procedure not work?

--- snip ---
mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
rainfall_value= 55)

mydf$string_data <- as.factor(mydf$string_data)

values<-as.integer(levels(mydf$string_data))

for (i in 1:length(values)) { 
assign(paste("VAR_", i, sep=""), values[i]) 
}

--- snip ---

Best,

Kimmo

Thu, 2024-03-28 at 14:17 +, Ebert,Timothy Aaron wrote:
> Here are some pieces of working code. I assume you want the second one or
> the third one that is functionally the same but all in one statement. I
> do not understand why it is a factor, but I will assume that there is a
> current and future reason for that. This means I cannot alter the
> string_data variable, or you can simplify by not making the variable a
> factor only to turn it back into character.
> 
> mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
> rainfall_value= 55)
> mydf$string_data <- as.factor(mydf$string_data)
> 
> mydf <- data.frame(id_station = 1234, string_data = "2024, 12, 1, 0, 0",
> rainfall_value= 55)
> mydf$string_data <- as.factor(mydf$string_data)
> 
> mydf <- data.frame(id_station = 1234, string_data = as.factor("2024, 12,
> 1, 0, 0"), rainfall_value= 55)
> 
> mydf <- data.frame(id_station = 1234, string_data = as.factor("2024, 12,
> 1, 0, 0"), rainfall_value= 55)
> mydf$string_data2 <- as.character(mydf$string_data)
> 
> # I assume there are many records in the data frame and your example is
> # for demonstration only.
> # I cannot assume that all records are the same, though you may be able to
> # simplify if that is true.
> #Split the string based on commas.
> split_values <- strsplit(mydf$string_data2, ",")
> 
> # find the maximum string length
> max_length <- max(lengths(split_values))
> 
> # Add new variables to the data frame
> for (i in 1:max_length) {
>   new_var_name <- paste0("VAR_", i)
>   mydf[[new_var_name]] <- sapply(split_values, function(x)
> ifelse(length(x) >= i, x[i], NA))
> }
> 
> # Convert to numeric
>  for (i in 1:max_length) {
>    new_var_name <- paste0("VAR_", i)
>    mydf[[new_var_name]] <- as.numeric(mydf[[new_var_name]])
>  }
> # remove trash
> mydf <- mydf[,-4]
> # Provide more useful names
> colnames(mydf) <- c("id_station", "string_data", "rainfall_mm", "Year",
> "Month", "Day", "hour", "minute")
> 
> Regards,
> Tim
> 
> -Original Message-
> From: R-help  On Behalf Of Stefano Sofia
> Sent: Thursday, March 28, 2024 7:48 AM
> To: Fabio D'Agostino ; r-help@R-project.org
> Subject: Re: [R] split a factor into single elements
> 
> Sorry for my hurry.
> 
> The correct reproducible code is different from the initial one. The
> correct example is
> 
> 
> mydf <- data.frame(id_station = 1234, string_data = as.factor(2024, 12,
> 1, 0, 0), rainfall_value= 55)
> 
> 
> In this case mydf$string_data is a factor, but of length 1 (and not 5
> like in the initial example).
> 
> Therefore the suggestion offered by Fabio does not work.
> 
> 
> Any suggestion?
> 
> Sorry again for my mistake
> 
> Stefano
> 
> 
> 
>  (oo)
> --oOO--( )--OOo--
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy Meteo Section Snow Section Via
> del Colle Ameno 5
> 60126 Torrette di Ancona, Ancona (AN)
> Uff: +39 071 806 7743
> E-mail: stefano.so...@regione.marche.it
> ---Oo-oO
> 
> 
> 
> From: Fabio D'Agostino 
> Sent: Thursday, 28 March 2024 12:20
> To: Stefano Sofia; r-help@R-project.org
> Subject: Re: [R] split a factor into single elements
> 
> 
> 
> Hi Stefano,
> maybe something like this can help you?
> 
> myfactor <- as.factor(c(2024, 2, 1, 0, 0))
> 
> # Convert factor values to integers
> first_element <- as.integer(as.character(myfactor)[1])
> second_element <- as.integer(as.character(myfactor)[2])
> third_element <- as.integer(as.character(myfactor)[3])
> 
> # Print the results
> first_element
> [1] 2024
> second_element
> [1] 2
> third_element
> [1] 1
> 
> # Check the type of the object
> typeof(first_element)
> [1] "integer"
> 
> Fabio
> 
> On Thu, 28 Mar 2024 at 11:29, Stefano Sofia
> mailto:stefano.so...@regione.marche.it>>
> wrote:
> Dear R-list users,
> 
> forgive me for this silly question, I did my best to find a solution with
> no success.
> 
> Suppose I have a factor type like
> 
> 
> myfactor <- as.factor(2024, 2, 1, 0, 0)
> 
> 
> There are no characters (and therefore strsplit, for example, does not
> work).
> 
> I need to store the 1st, 2nd and 3rd elements separately as integers. How
> can I do this?
> 
> 
> Thank you for your help
> 
> Stefano
> 
> 
>  (oo)
> --oOO--( )--OOo--
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy