[R] Java Exception error while reading large data in R from DB using RJDBC.

2012-10-30 Thread aajit75
Dear List, I get a Java exception error while reading large data into R from a DB using RJDBC. I am trying to read large data from a DB table (Vectorwise) using an RJDBC connection. I have tested the connection with small data and was able to fetch DB tables using the same connection (conn, as in my code). Pleas

[R] Solving binary integer optimization problem

2012-08-10 Thread aajit75
Hi, I am new to using R for solving optimization problems. I have a set of communication channels with limited capacity and two types of cost, fixed and variable. Each channel has an expected gain for a single communication. I want to determine the optimal number of communications for each channel maximiz

[R] Issues while using “lift.chart” and “adjProbScore” function from “BCA” library

2012-05-24 Thread aajit75
Dear List, A couple of issues while using functions from the “BCA” library: 1. I am trying to use the “lift.chart” function from the “BCA” library, but I am facing issues when the model formula is passed to glm as a formula object. When the model formula is written as text, it works fine. In my case

Re: [R] Assign value to new variable based on conditions on other variables

2012-04-10 Thread aajit75
I have found a solution using the within function, as below:

    dd$Seg <- 1
    dd <- within(dd, Seg[x2 > 0 & x3 > 200] <- 1)
    dd <- within(dd, Seg[x2 > 100 & x3 > 300] <- 2)
    dd <- within(dd, Seg[x2 > 200 & x3 > 400] <- 3)
    dd <- within(dd, Seg[x2 > 300 & x3 > 500] <- 4)

Is there any better way of doing it?
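One possible alternative to the chained within() calls is plain indexed assignment on the data frame. This is only a sketch: the example rows in `dd` are hypothetical (the real data is not shown in the post), and the rules are applied in the same order so that the stricter, later rules overwrite the earlier ones.

```r
# Hypothetical example data; the real `dd` is not shown in the post.
dd <- data.frame(x2 = c(50, 150, 250, 350, 0),
                 x3 = c(250, 350, 450, 550, 100))

# Start from the default value, then overwrite in order of increasing
# strictness so that later rules win, as in the within() version.
dd$Seg <- 1
dd$Seg[dd$x2 > 100 & dd$x3 > 300] <- 2
dd$Seg[dd$x2 > 200 & dd$x3 > 400] <- 3
dd$Seg[dd$x2 > 300 & dd$x3 > 500] <- 4

print(dd)
```

Because each rule is a single vectorised assignment, adding or reordering segments only means adding or moving one line.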

[R] Assign value to new variable based on conditions on other variables

2012-04-10 Thread aajit75
Hi Experts, This may be a simple question. I want to create a new variable "seg" and assign values to it based on conditions satisfied by each observation. Here is the example: ##Below are the conditions ##if variable x2 gt 0 and x3 gt 200 then seg should take value 1, ##if variable x2 gt 100

[R] Passing date as parameter while retrieving data from database using dbGetQuery

2012-02-15 Thread aajit75
Hi All, This might be a simple question. I need to retrieve data for modelling from the databases. The date value changes every time, so I cannot hard-code it; it has to be passed as a parameter. When I pass the date as a parameter, it throws an error. (ERROR: column "start_dt" does not exist
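The query itself is cut off in the post, so the cause of the error cannot be confirmed. One common cause of a "column does not exist" message is that the R variable name ends up inside the SQL string instead of its value. A minimal sketch of building the query text from an R date follows; the table name `my_table` and column name `txn_date` are hypothetical placeholders, and the database call is left commented out because it needs a live connection.

```r
# Hypothetical date parameter; in practice this would change on every run.
start_dt <- as.Date("2012-01-01")

# Insert the *value* of start_dt, quoted as a SQL string literal,
# rather than the variable name itself.
query <- sprintf("SELECT * FROM my_table WHERE txn_date >= '%s'",
                 format(start_dt, "%Y-%m-%d"))
print(query)

# With an open connection `conn`, the query would then be run as:
# result <- dbGetQuery(conn, query)
```

Pasting values into SQL is only reasonable when the value is generated by trusted code, as with a date produced by format() here.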

[R] Java heap space Error while reading table from postgres database using RJDBC

2012-02-09 Thread aajit75
Hi List, I am reading a table from a postgres database into an R session using RJDBC. The table contains 150 columns and 20 rows. Sample code is below; it works fine with smaller tables. db_driver <- mydir$db_driver db_jar_fi

[R] Creating and assigning variable names in loop

2011-12-21 Thread aajit75
Hello List, I am trying to create and assign variable names in a loop, but I am not able to get the expected variable names. Here is the sample code: n = 10 set.seed(1) x1 = rnorm(n,0) x2 = rnorm(n,0) samp_data <- data.frame(x1,x2) for( i in 1:3) { label <- paste("score", i, sep="_") assign(label
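The loop in the post is cut off at the assign() call, so its intended result is not known. If the aim is to add columns named score_1, score_2, ... to the data frame, one option is to index the data frame by the generated name instead of calling assign(). In this sketch the score formula (x1 multiplied by i) is purely a placeholder.

```r
n <- 10
set.seed(1)
samp_data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

for (i in 1:3) {
  label <- paste("score", i, sep = "_")
  # Placeholder computation; the original one is truncated in the post.
  samp_data[[label]] <- samp_data$x1 * i
}

print(names(samp_data))
```

Using `[[label]]` keeps the new variables inside the data frame, whereas assign() creates separate objects in the workspace.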

[R] Calculating the probability of an event at time "t" from a Cox model fit

2011-12-19 Thread aajit75
Dear R-users, I would like to determine the probability of an event at a specific time using a Cox model fit. On the development sample data I am able to get the probability of an event at time point (t). I need the probability score of an event at a specific time, using a scoring dataset which will have o

[R] Any function\method to use automatically Final Model after bootstrapping using boot.stepAIC()

2011-11-29 Thread aajit75
Hi List, Being new to R, I am trying to apply boot.stepAIC() for model selection by bootstrapping the stepAIC() procedure. I have gone through the discussions in various threads on variable selection methods and understood the pros and cons of the various methods. I am also going through the regression modelli

[R] Similar function for Redun() from Hmisc ?

2011-11-22 Thread aajit75
Hi List, I am working on a large data frame (number of records=35000 and number of variables=160), using redun() to drop variables before using them in a model. V <- redun(~., data = data.frame, r2 = 0.8) It takes an enormous amount of time to execute. Is there anything wrong in the script? Suggest an

[R] Putting directory path as a parameter

2011-11-15 Thread aajit75
Hi List, I am new to R, so this may be simple. I want to store a directory path as a parameter, which will in turn be used while reading and writing data from csv files. How can I use dir, defined in the example below, while reading the csv file? Example: dir <- "C:/Users/Desktop" #location of
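A small sketch of one way to reuse a stored directory path: file.path() joins the directory and a file name, and the result can be passed to write.csv() and read.csv(). To keep the example runnable anywhere, tempdir() stands in for "C:/Users/Desktop", and the file name example.csv is hypothetical.

```r
# tempdir() stands in for a fixed path such as "C:/Users/Desktop".
dir <- tempdir()

# Build the full file name from the directory and a (hypothetical) file name.
csv_file <- file.path(dir, "example.csv")

# Write a small data frame, then read it back using the same path.
write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), csv_file, row.names = FALSE)
dat <- read.csv(csv_file)
print(dat)
```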

Re: [R] Decision tree model using rpart ( classification

2011-11-04 Thread aajit75
Hi, Thanks for the response. The code for each case is: c_c_factor <- 0.001 min_obs_split <- 80 A) fit <- rpart(segment ~., method="class", control=rpart.control(minsplit=min_obs_split, cp=c_c_factor), data=Beh_cluster_out) B) fit <- rpart(segment ~., method="class",

[R] Decision tree model using rpart ( classification

2011-11-04 Thread aajit75
Hi Experts, I am new to R and am using a decision tree model to get segmentation rules. A) Using behavioural data (attributes defining customer behaviour, for example balances, number of accounts, etc.) 1. Clustering: Cluster behavioural data to suitable number of clusters 2. Decision Tree: Using rpart

[R] Creating deciles on data using one variable

2011-11-02 Thread aajit75
I need to split data containing more than one variable into deciles based on any one variable. I am using the script below: id <-c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20) tot <-c(1230, 1230, 2345, 3456, 456, 4356, 123, 124, 987, 785, 5646, 345, 2345, 3456, 456, 4356, 123, 124, 987, 785) data <-
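The script in the post is cut off after `data <-`, so the sketch below assumes the data frame is simply built from `id` and `tot`. It assigns deciles from ranks rather than from quantile break points, which is one way to cope with the repeated values in `tot`; the column name `decile` is an assumption.

```r
id  <- 1:20
tot <- c(1230, 1230, 2345, 3456, 456, 4356, 123, 124, 987, 785,
         5646, 345, 2345, 3456, 456, 4356, 123, 124, 987, 785)
data <- data.frame(id, tot)  # assumed construction; truncated in the post

# Rank the decile variable (ties broken by order of appearance),
# then map ranks 1..n onto groups 1..10 of equal size.
r <- rank(data$tot, ties.method = "first")
data$decile <- ceiling(10 * r / nrow(data))

print(table(data$decile))
```

Breaking ties by order of appearance means two rows with the same `tot` can fall into neighbouring deciles; whether that is acceptable depends on the use case.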

[R] How to get Quartiles when data contains both numeric variables and factors

2011-10-31 Thread aajit75
When data contains both factor and numeric variables, how can I get quartiles for all numeric variables? n <- 100 x1 <- runif(n) x2 <- runif(n) x3 <- x1 + x2 + runif(n)/10 x4 <- x1 + x2 + x3 + runif(n)/10 x5 <- factor(sample(c('a','b','c'),n,replace=TRUE)) x6 <- factor(1*(x5=='a' | x5=='c')) dat
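One way to approach this is to select the numeric columns first and then apply quantile() to each of them. The sketch below reuses the variables from the post and assumes that the truncated last statement builds a data frame `dat` from x1 to x6; the set.seed() call is added only for reproducibility.

```r
set.seed(1)  # added for reproducibility; not in the original post
n  <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n) / 10
x4 <- x1 + x2 + x3 + runif(n) / 10
x5 <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
x6 <- factor(1 * (x5 == "a" | x5 == "c"))
dat <- data.frame(x1, x2, x3, x4, x5, x6)  # assumed; truncated in the post

# Keep only the numeric columns, then compute quartiles column by column.
is_num    <- sapply(dat, is.numeric)
quartiles <- sapply(dat[is_num], quantile, probs = c(0.25, 0.5, 0.75))
print(quartiles)
```

The result is a matrix with one column per numeric variable and one row per quartile; the factor columns x5 and x6 are skipped.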

Re: [R] Data frame manipulation by eliminating rows containing extreme values

2011-10-23 Thread aajit75
Hi David, Thanks for the reply. f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm = TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)} Here the parameter 1.5 is set in the above function as an example; it could be larger, maybe 3.0, after analyzing the actual data. Here expecta

[R] Data frame manipulation by eliminating rows containing extreme values

2011-10-22 Thread aajit75
Dear All, I have got the limits for removing extreme values for each variable using the following function. f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm = TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)} #Example: n <- 100 x1 <- runif(n) x2 <- runif(n) x3 <- x1 + x2 + runif(
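The example in the post is truncated, so the following is only a sketch of one way to drop rows with extreme values once per-variable limits are available. It does not reuse f() from the post; it uses the conventional fences Q1 - k*IQR and Q3 + k*IQR with k = 1.5. The data, the injected extreme value, and the helper name `in_limits` are hypothetical.

```r
set.seed(1)
n   <- 100
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$x1[1] <- 50  # inject one extreme value for illustration

# TRUE where x lies inside [Q1 - k * IQR, Q3 + k * IQR].
in_limits <- function(x, k = 1.5) {
  q   <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[2] - q[1]
  x >= q[1] - k * iqr & x <= q[2] + k * iqr
}

# Keep a row only if every variable is within its own limits.
keep  <- Reduce(`&`, lapply(dat, in_limits, k = 1.5))
clean <- dat[keep, ]
print(nrow(clean))
```

Changing k (for example to 3.0) widens the limits without touching the rest of the code.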

Re: [R] How to remove multiple outliers

2011-10-21 Thread aajit75
Hi Michael, Thanks for the help. Yes, I have gone through the documentation for ?outlier. As it removes one outlier at a time, and being new to R, I was wondering whether there is any function available for removing multiple outliers without calling, say, rm.outlier n times, because n is not finite he

[R] How to remove multiple outliers

2011-10-20 Thread aajit75
Hi All, I am working on a dataset in which some of the variables have more than one outlying observation. I am using the sample script below: library(outliers) x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17, 10099, 10099, 10098) outlier_tf1 = outlier(x1,l

[R] Subsetting data by eliminating redundant variables

2011-10-19 Thread aajit75
Dear All, I am new to R and have one question which might be easy. I have a large dataset with more than 250 variables, and I am reducing the number of variables with the redun function as in the example below: n <- 100 x1 <- runif(n) x2 <- runif(n) x3 <- x1 + x2 + runif(n)/10 x4 <- x1 + x2 + x3 + runif(n)/10 x5 <