[R] Approximate taylor series

2016-04-27 Thread Anindya Sankar Dey
Hi All,

Say I have values of a function f(x1,x2,x3,x4) for some, but not all,
combinations of x1,x2,x3,x4, and the functional form is not known.

Techniques like regression have not given me satisfactory results, and
the function may be more complex than we thought.

I wanted to use Taylor's approximation of a continuous function to
approximate the functional form from the given data, but I could not find
a package in R that does that.

Can anyone suggest a way to do it?
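
One possible starting point, offered as a sketch rather than a definitive
answer: a true Taylor expansion needs derivatives at a point, which the data
do not provide, but a least-squares fit of a full second-order polynomial is
a common stand-in for a truncated expansion. The example below uses base R
only; the data-generating function and the names dat and fit are made up for
illustration.

```r
# Sketch: second-order polynomial response surface fitted by least squares.
# The "unknown" function below is hypothetical and only generates toy data.
set.seed(42)
n <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))
dat$y <- with(dat, exp(0.5 * x1) * cos(x2) + x3 * x4)

# polym() builds all monomials in x1..x4 with total degree <= 2
fit <- lm(y ~ polym(x1, x2, x3, x4, degree = 2, raw = TRUE), data = dat)

# Check the approximation on fresh points
new_pts <- data.frame(x1 = runif(50), x2 = runif(50), x3 = runif(50), x4 = runif(50))
truth <- with(new_pts, exp(0.5 * x1) * cos(x2) + x3 * x4)
max_err <- max(abs(predict(fit, newdata = new_pts) - truth))
```

Raising degree adds higher-order terms, at the cost of many more
coefficients when there are four inputs.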

-- 
Anindya Sankar Dey

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Probable Error in fmsb package

2015-12-29 Thread Anindya Sankar Dey
Hi All,

The fmsb package has a function for the Variance Inflation Factor, and its
documentation describes the function as follows:

"To evaluate multicolinearity of multiple regression model, calculating the
variance inflation factor (VIF) from the result of lm(). If VIF is more
than 10, multicolinearity is strongly suggested.
"

The function computes the VIF of a model as 1/(1-R^2), where R^2 is the
coefficient of determination of that model.

Nowhere in the literature have I come across this definition of VIF; VIF
is always computed at the level of an individual variable. The structure is
almost the same, but in the textbook VIF for a predictor, R^2 comes from
regressing that predictor on all the other predictors.

I only became aware of this when many freshers from non-statistics
backgrounds, whom I interviewed for an analytics position, answered that the
only definition of VIF they know is 1/(1 - Coeff. of Determination), and that
there is an R package which calculates VIF like that.

After some research I found that such a function does indeed exist in the
fmsb package.

Please help me understand: has an alternative definition of the Variance
Inflation Factor ever emerged in theory? Does it really make sense to have a
VIF at the model level, given that it does not help in solving the problem of
multicollinearity during model building?

And if I am right, what steps should I take about it?
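
For reference, the per-variable definition I have in mind can be sketched in
base R as follows. The helper name vif_per_predictor and the toy data are my
own, purely for illustration; this is not code from fmsb.

```r
# Sketch: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on all the other predictors.
vif_per_predictor <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    f <- reformulate(setdiff(names(X), v), response = v)
    r2 <- summary(lm(f, data = X))$r.squared
    1 / (1 - r2)
  })
}

set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)   # strongly correlated with x1
x3 <- rnorm(n)                  # roughly independent of the others
X <- data.frame(x1, x2, x3)
vifs <- vif_per_predictor(X)
```

The same values are the diagonal of the inverse correlation matrix of the
predictors, so diag(solve(cor(X))) is a quick cross-check.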


-- 
Anindya Sankar Dey


[R] Text Mining - Remove punctuation not removing quotes and dashes

2015-06-07 Thread Anindya Sankar Dey
Hi,

I have been doing some text mining. I created the DTM matrix using the
following steps.

corpus1 <- VCorpus(VectorSource(resume1$Dat1))

corpus1 <- tm_map(corpus1, content_transformer(tolower))

dtm <- DocumentTermMatrix(corpus1,
                          control = list(removePunctuation = TRUE,
                                         removeNumbers = TRUE,
                                         removeSparseTerms = TRUE,
                                         stopwords = TRUE))


After running all of this I am still getting terms like -quotation, "fun,
model", etc.

What can I do about it? I do not need these dashes and extra quotation marks.
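
A possible workaround, sketched in base R under the assumption that the
leftover characters are either ordinary punctuation or typographic (curly)
quotes and long dashes: strip them explicitly before building the matrix.
The helper name strip_punct is hypothetical.

```r
# Sketch: replace ASCII punctuation plus curly quotes and en/em dashes
# with a space, then collapse repeated whitespace.
strip_punct <- function(x) {
  extra <- "\u201c\u201d\u2018\u2019\u2013\u2014"   # curly quotes, en/em dash
  x <- gsub(paste0("[[:punct:]", extra, "]"), " ", x)
  trimws(gsub("[[:space:]]+", " ", x))
}

cleaned <- strip_punct(c("-quotation", "\"fun", "model\"",
                         "\u201csmart\u201d \u2013 quotes"))
```

With tm, a function like this could be applied to the corpus through
tm_map(corpus1, content_transformer(strip_punct)) before calling
DocumentTermMatrix. Note that replacing punctuation with a space splits
words such as "don't" into two tokens.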

-- 
Anindya Sankar Dey


[R] plm returning less number of variable than provided as input

2015-04-13 Thread Anindya Sankar Dey
Hi,

I was trying a plm model with around 400 variables, but after passing them
to the plm function I am getting coefficients for only 265 variables.

Can anyone explain the reason? Is there a size restriction in plm?
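
One possible cause, stated as an assumption because the data and the call are
not shown: plm's default "within" model demeans every variable per
individual, so any variable that is constant within each individual becomes
all zeros and cannot get a coefficient. Exactly collinear columns are dropped
for the same reason. The base R sketch below only illustrates the demeaning
step with made-up data; it does not call plm.

```r
# Sketch: within-individual demeaning wipes out time-invariant variables.
id <- rep(c("A", "B", "C"), each = 4)
time_invariant <- rep(c(1.5, 2.0, 3.5), each = 4)   # constant per individual
set.seed(1)
time_varying <- rnorm(12)

demean <- function(x, g) x - ave(x, g)              # subtract group means

within_invariant <- demean(time_invariant, id)      # all zeros
within_varying <- demean(time_varying, id)          # still carries variation
```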

-- 
Anindya Sankar Dey



[R] Data transformation to list for event occurence

2013-11-12 Thread Anindya Sankar Dey
Hi,

Say I have the following data

ID   Week   Event_Occurence
A    1      0
A    2      0
A    3      1
A    4      0
B    1      1
B    2      0
B    3      0
B    4      1

which records whether an individual experienced an event in a particular
week.

I wish to create a list whose first element is a vector of the week numbers
in which the event occurred for A, followed by the same for B.

Can you help me create this?
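
To make the target concrete, here is a minimal base R sketch of the list I
am after, using the data above. The object names dat, hits and event_weeks
are only for illustration.

```r
# Sketch: keep the rows where the event occurred, then split Week by ID.
dat <- data.frame(
  ID = rep(c("A", "B"), each = 4),
  Week = rep(1:4, times = 2),
  Event_Occurence = c(0, 0, 1, 0, 1, 0, 0, 1)
)

hits <- dat[dat$Event_Occurence == 1, ]
event_weeks <- split(hits$Week, hits$ID)
# event_weeks$A is 3; event_weeks$B is c(1, 4)
```

If ID is converted to a factor before subsetting, individuals with no events
are kept in the list as empty vectors.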

-- 
Anindya Sankar Dey



Re: [R] (no subject)

2013-08-23 Thread Anindya Sankar Dey
You can easily subset the data and then use rowSums().

Say your dataset is named data1.

Then write data2<-data1[,c(7,12,45,57)]

and then result<-rowSums(data2)
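
As a concrete sketch of the same idea, with the sum written back as the new
column m. The toy data frame stands in for data.csv, which is not available
here; in practice data1 would come from read.csv("data.csv").

```r
# Sketch: toy data with 4 rows and 60 columns in place of the real file.
data1 <- as.data.frame(matrix(1:(4 * 60), nrow = 4))

cols <- c(7, 12, 45, 57)              # the columns to add up
data1$m <- rowSums(data1[, cols])     # one sum per row, stored as column m
```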


On Fri, Aug 23, 2013 at 3:47 PM, rajib prasad  wrote:

> I am new to R. I have a data like:
>
>      x   y   z   w   p  ...  m
> 1   10  15  20  25  30
> 2   11  16  21  26  31
> 3   12  17  18  19  20
> 4   51  52  53  55  67
> ...
>
> Thus I have 145 rows and 160 columns in my data, which is stored in
> data.csv. Now I want to create a new column 'm' such that, for every row,
> m = column 7 + column 12 + column 57 + column 45, i.e. the sum of that
> row's values in columns 7, 12, 45 and 57.
> So, how do I write the code for this operation?
>
>
>
>
>
> --
>
>
>
> RAJIB PRASAD
>
> Centre for Economic Studies & Planning
> Jawaharlal Nehru University
> New Delhi-67
> contact no: 09868320368
> mail id: rwho2...@gmail.com
>
>



-- 
Anindya Sankar Dey



Re: [R] Combining two tables without going through lot of ifelse statement

2013-08-23 Thread Anindya Sankar Dey
Hi All,

Arun's solution works.

Now, can someone help me with an extension?


If we have multiple tables like this, combining them with rbind works. But
if I want a generic function where we do not know in advance how many tables
will be created, can that also be done without loops?
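
To show what I mean, here is a minimal sketch that assumes the tables are
collected in a list (here called tabs, a name I made up) and reuses the
aggregate() step from Arun's solution:

```r
# Sketch: any number of two-column matrices, collected in a list.
tabs <- list(
  matrix(c(1, 3, 0, 10, 5, 0), ncol = 2),
  matrix(c(2, 0, 3, 10, 0, 5), ncol = 2)
)

# do.call() passes every list element to rbind, however many there are
combined <- as.data.frame(do.call(rbind, tabs))   # columns V1, V2

res <- aggregate(V2 ~ V1, data = combined, FUN = sum)
res <- res[res$V1 != 0, ]                         # drop the all-zero rows
```

The list can hold two tables or two hundred; only the way it is built
changes.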


On Fri, Aug 23, 2013 at 7:15 PM, arun  wrote:

>
>
> In the case of ?data.table()
>
> dt1<-data.table(rbind(as.matrix(dat1),as.matrix(dat2))) ## converted the
> data.frame to matrix to mimic the situation
>
>  dt2<- subset(dt1[,sum(V2),by=V1],V1!=0)
>  setnames(dt2,2,"V2")
>  dt2
> #   V1 V2
> #1:  1 10
> #2:  3 10
> #3:  2 10
>
>
> #or
>
>
>  
> res<-with(as.data.frame(rbind(as.matrix(dat1),as.matrix(dat2))),aggregate(V2~V1,FUN=sum))
>  res1<- res[res[,1]!=0,]
>  res1
> #  V1 V2
> #2  1 10
> #3  2 10
> #4  3 10
> A.K.
> 
> From: Anindya Sankar Dey 
> To: arun 
> Sent: Friday, August 23, 2013 9:40 AM
> Subject: Re: [R] Combining two tables without going through lot of ifelse
> statement
>
>
>
> Mine is matrices, will this work on matrices as well?
>
> Thank for your help
>
>
>
> On Fri, Aug 23, 2013 at 7:02 PM, arun  wrote:
>
> However it is not clear when you mention these are tables.  There is
> ?table() and ?data.frame and the structure will be different in each case.
> Here, I assumed that your table is data.frame..
> >
> >
> >
> >
> >- Original Message -
> >From: arun 
> >To: Anindya Sankar Dey 
> >Cc: R help 
> >Sent: Friday, August 23, 2013 9:30 AM
> >Subject: Re: [R] Combining two tables without going through lot of ifelse
>   statement
> >
> >Hi,
> >Try:
> >
> >dat1<- read.table(text="
> >1 10
> >3  5
> >0  0
> >",sep="",header=FALSE)
> >dat2<- read.table(text="
> >2 10
> >0  0
> >3  5
> >",sep="",header=FALSE)
> >res<-with(rbind(dat1,dat2),aggregate(V2~V1,FUN=sum))
> >res1<-res[res[,1]!=0,]
> > res1
> >#  V1 V2
> >#2  1 10
> >#3  2 10
> >#4  3 10
> >
> >#or
> >library(data.table)
> >dt1<- data.table(rbind(dat1,dat2))
> > dt2<-subset(dt1[,sum(V2),by=V1],V1!=0)
> > setnames(dt2,2,"V2")
> > dt2
> >#   V1 V2
> >#1:  1 10
> >#2:  3 10
> >#3:  2 10
> >
> >A.K.
> >
> >- Original Message -
> >From: Anindya Sankar Dey 
> >To: r-help 
> >Cc:
> >Sent: Friday, August 23, 2013 8:59 AM
> >Subject: [R] Combining two tables without going through lot of ifelse
> statement
> >
> >HI All,
> >
> >Suppose I have two table like below
> >
> >Table 1:
> >
> >1 10
> >3  5
> >0  0
> >
> >Table 2:
> >
> >2 10
> >0  0
> >3  5
> >
> >
> >I need to create a new table like below
> >
> >Table 3:
> >
> >1 10
> >2 10
> >3 10
> >
> >The row may interchange in table 3, but is there any way to do this
> instead
> >of writing lot of if-else and loops?
> >
> >Thanks in advance.
> >
> >--
> >Anindya Sankar Dey
> >
> >
> >
>
>
> --
> Anindya Sankar Dey
>



-- 
Anindya Sankar Dey



[R] Combining two tables without going through lot of ifelse statement

2013-08-23 Thread Anindya Sankar Dey
Hi All,

Suppose I have two tables like below

Table 1:

1 10
3  5
0  0

Table 2:

2 10
0  0
3  5


I need to create a new table like below

Table 3:

1 10
2 10
3 10

The rows of table 3 may be in any order, but is there any way to do this
without writing a lot of if-else statements and loops?

Thanks in advance.

-- 
Anindya Sankar Dey



[R] Apriori probabilities in naiveBayes function

2013-08-05 Thread Anindya Sankar Dey
Hi All,

I applied the naiveBayes function in the e1071 package to the iris data, and
here's the list that was created

structure(list(apriori = structure(c(50L, 50L, 50L), .Dim = 3L, .Dimnames =
structure(list(
Y = c("setosa", "versicolor", "virginica")), .Names = "Y"), class =
"table"),
tables = structure(list(Sepal.Length = structure(c(5.006,
5.936, 6.588, 0.352489687213451, 0.516171147063863, 0.635879593274432
), .Dim = c(3L, 2L), .Dimnames = structure(list(Y = c("setosa",
"versicolor", "virginica"), Sepal.Length = NULL), .Names = c("Y",
"Sepal.Length"))), Sepal.Width = structure(c(3.428, 2.77,
2.974, 0.379064369096289, 0.313798323378411, 0.322496638172637
), .Dim = c(3L, 2L), .Dimnames = structure(list(Y = c("setosa",
"versicolor", "virginica"), Sepal.Width = NULL), .Names = c("Y",
"Sepal.Width"))), Petal.Length = structure(c(1.462, 4.26,
5.552, 0.173663996480184, 0.469910977239958, 0.551894695663983
), .Dim = c(3L, 2L), .Dimnames = structure(list(Y = c("setosa",
"versicolor", "virginica"), Petal.Length = NULL), .Names = c("Y",
"Petal.Length"))), Petal.Width = structure(c(0.246, 1.326,
2.026, 0.105385589380046, 0.197752680004544, 0.274650055636667
), .Dim = c(3L, 2L), .Dimnames = structure(list(Y = c("setosa",
"versicolor", "virginica"), Petal.Width = NULL), .Names = c("Y",
"Petal.Width")))), .Names = c("Sepal.Length", "Sepal.Width",
"Petal.Length", "Petal.Width")), levels = c("setosa", "versicolor",
"virginica"), call = quote(naiveBayes.default(x = X, y = Y,
laplace = laplace))), .Names = c("apriori", "tables",
"levels", "call"), class = "naiveBayes")


What I'm unable to understand is this: the first element of the list is a
vector of counts (50,50,50), yet when the model is printed it correctly shows
(0.33,0.33,0.33).

Can anyone tell me which part of the code is doing this?
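
My working guess, not yet confirmed against the package source, is that the
object stores the raw class counts and the proportions are only computed when
the object is printed. The print method can be inspected with
getAnywhere("print.naiveBayes") once e1071 is loaded. The base R sketch below
only mimics that normalisation; it does not call e1071.

```r
# Sketch: class counts versus the proportions shown on printing.
counts <- table(Y = iris$Species)   # 50, 50, 50, as stored in apriori
props <- counts / sum(counts)       # each class divided by the total
print(round(props, 2))
```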

-- 
Anindya Sankar Dey



[R] Plotting time data for various countries in same graph

2013-03-06 Thread Anindya Sankar Dey
Hi,

I have data of the following kind:

Time    Country  Values
2010Q1  India     5
2010Q2  India     7
2010Q3  India     5
2010Q4  India     9
2010Q1  China    10
2010Q2  China     6
2010Q3  China     9
2010Q4  China    14


I need to plot a graph with Time on the x-axis, Values on the y-axis,
and two lines, one for India and one for China.

I don't have much knowledge of graphics in R.

I was trying to use ggplot(data,aes(x=Time,y=Values,colour=Country))

but this does not help.

Can anyone help me with this?
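
For completeness, a minimal sketch of what seems to be missing from the call
above: ggplot() with only aes() sets up the plot but draws nothing until a
geom layer is added, and with a discrete x variable such as Time, geom_line()
also needs a group aesthetic to know which points to connect. The data frame
name dat is mine, and the plotting step is guarded so the snippet still runs
where ggplot2 is not installed.

```r
# Sketch: the data from this post, with Time kept in its original order.
dat <- data.frame(
  Time = rep(c("2010Q1", "2010Q2", "2010Q3", "2010Q4"), times = 2),
  Country = rep(c("India", "China"), each = 4),
  Values = c(5, 7, 5, 9, 10, 6, 9, 14)
)
dat$Time <- factor(dat$Time, levels = unique(dat$Time))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  # group = Country gives one line per country
  p <- ggplot2::ggplot(dat, ggplot2::aes(x = Time, y = Values,
                                         colour = Country, group = Country)) +
    ggplot2::geom_line() +
    ggplot2::geom_point()
  # print(p) draws the plot
}
```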

-- 
Anindya Sankar Dey



Re: [R] improving/speeding up a very large, slow simulation

2013-02-11 Thread Anindya Sankar Dey
.out=length.out)
>
> permut.grid<-expand.grid(number.strata.range, project.n.range,
> project.acreage.range, project.mean, project.sd.range,
> number.verification.plots, verification.range, allowed.deviation) # create
> a matrix with all combinations of the supplied vectors
>
> #assign names to the colums of the grid of combinations
> names.permut<-c("number.strata", "project.n.plots", "project.acreage",
> "project.mean", "project.sd", "number.verification.plots",
> "verification.mean", "allowed.deviation")
>
> names(permut.grid)<-names.permut # done
>
> combinations<-length(permut.grid[,1])
>
> size <-reps*combinations #need to know the size of the master matrix, which
> is the the number of replications * each combination of the supplied
> factors
>
> # we want a df from which to read all the data into the simulation, and
> record the results
> permut.col<-ncol(permut.grid)
> col.base<-ncol(permut.grid)+2
> base <- matrix(nrow=size, ncol=col.base)
> base <-data.frame(base)
>
> # supply the names
> names.base<-c("number.strata", "project.n.plots", "project.acreage",
> "project.mean", "project.sd", "number.verification.plots",
> "verification.mean", "allowed.deviation","success.fail",
> "plots.to.success")
>
> names(base)<-names.base
>
> # need to create index vectors for the base, master df
> ends <- seq(reps+1, size+1, by=reps)
> begins <- ends-reps
> index <- cbind(begins, ends-1)
> #done
>
> # next, need to assign the first 6 columns and number of rows = to the
> number of reps in the simulation to be the given row in the permut.grid
> matrix
>
> pb <- winProgressBar(title="Create base, big loop 1 of 2", label="0% done",
> min=0, max=100, initial=0)
>
> for (i in 1:combinations) {
>
> base[index[i,1]:index[i,2],1:permut.col] <- permut.grid[i,]
> #progress bar
>  info <- sprintf("%d%% done", round((i/combinations)*100))
> setWinProgressBar(pb, (i/combinations)*100, label=info)
> }
>
> close(pb)
>
> # now, simply feed the values replicated the number of times we want to run
> the simulation into the sequential.unpaired function, and assign the values
> to the appropriate columns
>
> out.index1<-ncol(permut.grid)+1
> out.index2<-ncol(permut.grid)+2
>
> #progress bar
> pb <- winProgressBar(title="fill values, big loop 2 of 2", label="0% done",
> min=0, max=100, initial=0)
>
> for (i in 1:size){
>
> scalar.base <- base[i,]
>  verification.plots <- rnorm(scalar.base$number.verification.plots,
> scalar.base$verification.mean, scalar.base$project.sd)
>  result<- sequential.unpaired(scalar.base$number.strata,
> scalar.base$project.n.plots, scalar.base$project.mean, scalar.base$
> project.sd, verification.plots, scalar.base$allowed.deviation,
> scalar.base$project.acreage, min.plots='default', alpha)
>
> base[i,out.index1] <- result[[6]][1]
> base[i,out.index2] <- result[[7]][1]
>  info <- sprintf("%d%% done", round((i/size)*100))
> setWinProgressBar(pb, (i/size)*100, label=info)
> }
>
> close(pb)
> #return the results
> return(base)
>
> }
>
> # I would like reps to = 1000, but that takes a really long time right now
> test5 <- simulation.unpaired(reps=5, project.acreage.range = c(99,
> 110,510,5100,11000), project.mean=100, project.n.min=10, project.n.max=100,
> project.sd.min=.01,  project.sd.max=.2, verification.mean.min=100,
> verification.mean.max=130, number.verification.plots.min=10,
> number.verification.plots.max=50, length.out = 5)
>
>
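
One place that can be sped up without changing the results is the first loop
in the quoted code, which copies each row of permut.grid into reps
consecutive rows of base. A minimal sketch of a loop-free version follows. It
uses a toy grid (the values are made up for illustration) and the same
variable names as the quoted code.

```r
# Sketch: replicate every row of the grid 'reps' times in one step.
reps <- 3
permut.grid <- expand.grid(number.strata = 1:2, project.mean = c(100, 110))
combinations <- nrow(permut.grid)

# Row i of permut.grid ends up in rows ((i - 1) * reps + 1):(i * reps)
base <- permut.grid[rep(seq_len(combinations), each = reps), ]
rownames(base) <- NULL

# Empty result columns, to be filled by the simulation
base$success.fail <- NA
base$plots.to.success <- NA
```

This removes the index bookkeeping (begins, ends, index) and the progress
bar for that loop. The second loop draws random numbers per row, so it needs
separate treatment.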



-- 
Anindya Sankar Dey



Re: [R] Unable to work with Rattle

2012-10-14 Thread Anindya Sankar Dey
Hi,

You can download the package here
http://cran.r-project.org/web/packages/XML/index.html

Regards,
Anindya

On Sun, Oct 14, 2012 at 10:14 PM, balaji sarangarajan
wrote:

> Hello,
>
> I have installed R version 2.15.1 and I am trying to work with the
> rattle package. I got the error below:
>
> Error in loadTooltips() : could not find function "xmlTreeParse"
> In addition: Warning messages:
> 1: package ‘XML’ is not available (for R version 2.15.1)
> 2: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
>   there is no package called ‘XML’
>
> Is there any possibility of adding the XML package for R version 2.15.1?
>
> Thanks in advance.
>
> Regards,
> Balaji
>
>


-- 
Anindya Sankar Dey
