Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-19 Thread Nigel Legg
Some interesting looking references there, will take a look - thanks! Regards, Nigel Legg 07914 740972 http://www.trevanianlegg.co.uk http://twitter.com/nigellegg http://uk.linkedin.com/in/nigellegg

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-19 Thread Lee Zamparo
Thanks for the pointers Peter. I'm doing an unrelated project on covariate shift, and this will be really useful. Lee.

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-19 Thread Peter Prettenhofer
Hi Yogesh, the work by John Blitzer that I mentioned used the second approach -- it's described here: Blitzer, J., Dredze, M., Pereira, F., Jun. 2007. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In: Proceedings of ACL, Prague, Czech Republic, pp

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-19 Thread Yogesh Karpate
Hi folks, thanks a lot for suggesting good references! @Peter: you can send me more references. @Gael: wish you a speedy recovery! @Olivier: thanks a lot for patiently listening to my problem and asking for clarifications. From next time onwards I will try to be more specific explain

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-19 Thread Gael Varoquaux
Hi list, Coming back from travel, with a slight elbow injury that makes typing difficult... Anyhow, I just wanted to stress that a lot of good advice has been put forward in the discussion so far, and that, when we find time, I think that a subsection of the docs dealing with class imbalance, covar

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-16 Thread Peter Prettenhofer
To assess whether dataset shift has indeed occurred, I usually do the following: create a classification task to distinguish between the two datasets (e.g. the dataset from country A is the positive class, the dataset from country B is the negative class). Then I compare the classification loss (I usually use traini
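The check described above can be sketched roughly as follows, assuming scikit-learn and NumPy. The message is cut off before it says which loss is compared, so the cross-validated ROC AUC used here is an assumption, as are the helper name `domain_shift_score`, the logistic regression model and the synthetic data. A score near 0.5 means the two datasets are hard to tell apart; a score near 1.0 suggests that a shift has occurred.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score


def domain_shift_score(X_a, X_b, seed=0):
    """Cross-validated ROC AUC of a classifier separating dataset A from B.

    Hypothetical helper, not part of scikit-learn.
    """
    # Label every sample by the dataset it came from: A = 1, B = 0.
    X = np.vstack([X_a, X_b])
    y = np.concatenate([np.ones(len(X_a), dtype=int),
                        np.zeros(len(X_b), dtype=int)])
    clf = LogisticRegression(max_iter=1000)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    # Assumption: ROC AUC stands in for the (truncated) loss in the message.
    scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    return scores.mean()


# Synthetic stand-ins for features extracted from the two datasets.
rng = np.random.RandomState(0)
X_a = rng.normal(0.0, 1.0, size=(200, 5))
X_b_same = rng.normal(0.0, 1.0, size=(200, 5))     # same distribution as A
X_b_shifted = rng.normal(1.0, 1.0, size=(200, 5))  # mean shifted by 1.0

auc_same = domain_shift_score(X_a, X_b_same)
auc_shifted = domain_shift_score(X_a, X_b_shifted)
print("AUC, same distribution:    %.3f" % auc_same)
print("AUC, shifted distribution: %.3f" % auc_shifted)
```

With real data, `X_a` and `X_b` would be the feature matrices of the two datasets, computed with exactly the same feature extraction pipeline.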

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-16 Thread Olivier Grisel
Alright, thanks for the clarification. > So my question: Is there any other way to tackle the problem like "Transfer > Learning", "Zero-shot learning"? Any experience doing such task? We don't have turn-key tools in scikit-learn for transfer learning or zero-shot learning. It would be interestin

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-15 Thread Yogesh Karpate
Thanks a lot, Olivier, for suggesting Alex's blog. My apologies! Let me rephrase my problem. I have two data sets of brain MR images; let's call them A and B. A was acquired in one country and B in another. Data set A contains both patients with pathology and healthy volunteers, whereas data set B contain

Re: [Scikit-learn-general] Tackling Dataset bias

2013-08-15 Thread Olivier Grisel
I don't really understand what the samples, the labels and the features are in your case, how much unlabeled data you have, or what you mean by "I have completed the classification task on 1st database.": if you have labeled datasets, what does "completion of the classification task" mean?

[Scikit-learn-general] Tackling Dataset bias

2013-08-15 Thread Yogesh Karpate
Hello folks! I have two different brain MR image databases acquired in two different countries. I need to perform a patch-based supervised binary classification task (+ pathology and - normal). The 1st database contains both + pathology patients and - normal subjects, whereas the second