Re: [Pharo-users] Information Theory

2016-06-16 Thread Brice GOVIN
Information theory is about quantifying and qualifying the information content 
of a data set.
Basically, it means that for a given data set I can say which data is 
interesting and which is not (according to the algorithm I use).

It is used in information retrieval (IR).

I should have started with that maybe…

Actually, I made a mistake in talking about information theory; my question is 
really about information retrieval (my bad). With 
Moose-Algo-Information-Retrieval, I only get the set of words used in the 
documents. Has anyone worked on an algorithm to qualify these words?
There are different kinds of models:
- set-theoretic model
  - documents are represented as sets of words or phrases, and similarity 
    derives from set-theoretic operations on those sets (I don’t fully 
    understand this one yet)
  - common techniques are the Boolean model (several kinds) and fuzzy retrieval
- algebraic model
  - documents and queries are represented as vectors (or matrices or tuples), 
    and the similarity between query and document is computed from this 
    representation
  - common techniques are Latent Semantic Indexing (LSI) and the Vector Space 
    Model (several kinds)
- probabilistic model
  - there is no particular representation for documents here; similarity is 
    computed as the probability that the document is relevant to the query
  - common techniques are Latent Dirichlet Allocation, among others
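
As a rough illustration of the algebraic family above, here is a minimal Vector 
Space Model sketch for a Pharo Playground: each text becomes a term-frequency 
vector (a Bag) and similarity is the cosine between query and document. The 
sample texts and all variable names are made up, nothing here comes from 
Moose-Algo-Information-Retrieval, and this is not LSI (which would additionally 
apply an SVD to the term-document matrix).

```smalltalk
"Vector Space Model sketch. Every name below is illustrative only."
| tokenize dot cosine docs query scores |

"Lowercase the text and split it on spaces; a Bag counts occurrences per term."
tokenize := [ :text | (text asLowercase substrings: ' ') asBag ].

"Dot product over the distinct terms of a; terms missing from b count as 0."
dot := [ :a :b |
	a asSet inject: 0 into: [ :sum :term |
		sum + ((a occurrencesOf: term) * (b occurrencesOf: term)) ] ].

"Cosine similarity; defined here as 0 when either vector is empty."
cosine := [ :a :b | | norms |
	norms := ((dot value: a value: a) * (dot value: b value: b)) sqrt.
	norms = 0
		ifTrue: [ 0 ]
		ifFalse: [ (dot value: a value: b) / norms ] ].

docs := Dictionary new.
docs at: 'd1' put: (tokenize value: 'latent semantic indexing of documents').
docs at: 'd2' put: (tokenize value: 'boolean model of retrieval').

query := tokenize value: 'semantic indexing'.

"Score every document against the query."
scores := Dictionary new.
docs keysAndValuesDo: [ :name :vector |
	scores at: name put: (cosine value: query value: vector) ].
scores
```

With these sample texts, d1 shares two terms with the query and d2 shares none, 
so d1 should rank first and d2 should score zero. Raw term counts are the 
simplest weighting; a tf-idf weighting would be the usual next step.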


I’m leaning towards an algebraic model; is there perhaps something on Latent 
Semantic Indexing?

I’m not sure I explained my thinking well…

Regards,
--
Brice Govin
PhD student in RMoD research team at INRIA Lille
Software Engineer at THALES AIR SYSTEMS Rungis
ENSTA-Bretagne ENSI2014
22 Avenue du General Leclerc 92340 BOURG-LA-REINE

On 15 Jun 2016, at 22:54, Alexandre Bergel <alexandre.ber...@me.com> wrote:

Yes, tell us more.
This is an interesting topic

Alexandre
--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.



On Jun 15, 2016, at 4:38 PM, stepharo <steph...@free.fr> wrote:



On 15/6/16 at 19:03, Brice GOVIN wrote:
Hi,
I'd like to know whether anyone has worked on information theory algorithms.

Tell us more.
What is it?

I saw a package in Moose about information theory, but it is just a kind of 
document indexing.

Is there something more complete (quantity of information)?

Thanks,







[Pharo-users] Information Theory

2016-06-15 Thread Brice GOVIN
Hi,
I'd like to know whether anyone has worked on information theory algorithms. I 
saw a package in Moose about information theory, but it is just a kind of 
document indexing.

Is there something more complete (quantity of information)?

Thanks,

--
Brice Govin
PhD student in RMoD research team at INRIA Lille
Software Engineer at THALES AIR SYSTEMS Rungis
ENSTA-Bretagne ENSI2014
22 Avenue du General Leclerc 92340 BOURG-LA-REINE



Re: [Pharo-users] How to access XML tag name?

2016-03-11 Thread Brice GOVIN

Hi,

According to what I found in the SAXHandler and XMLDOMParser classes, you 
should use the SAXHandler>>characters: method to get the text inside an 
element (between its start and end tags).

The text inside an element is represented as an XMLString, which is created by 
a call to XMLDOMParser>>characters:.
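
To make this concrete, here is a minimal sketch of a SAXHandler subclass that 
records the text reported through characters:. TextCollectingHandler, currentTag 
and textByTag are hypothetical names of mine. The sketch assumes the XMLParser 
callbacks startElement:attributes:, characters: and endElement:, and that on: 
accepts a String as well as a file; please check both against your XMLParser 
version.

```smalltalk
"Hypothetical handler, for illustration only."
SAXHandler subclass: #TextCollectingHandler
	instanceVariableNames: 'currentTag textByTag'
	classVariableNames: ''
	category: 'SmartXMLHandler'

TextCollectingHandler >> textByTag
	"Lazily created Dictionary: tag name -> collection of text chunks."
	^ textByTag ifNil: [ textByTag := Dictionary new ]

TextCollectingHandler >> startElement: aQualifiedName attributes: aDictionary
	"Remember which element we are currently inside."
	currentTag := aQualifiedName

TextCollectingHandler >> characters: aString
	"The parser may deliver an element's text in several chunks, so append."
	currentTag ifNil: [ ^ self ].
	(self textByTag at: currentTag ifAbsentPut: [ OrderedCollection new ])
		add: aString

TextCollectingHandler >> endElement: aQualifiedName
	"Text that follows a closing tag is not attributed to any element."
	currentTag := nil

"Playground usage, with a simplified inline document instead of FILMS.XML."
| h |
h := TextCollectingHandler on: '<FILM><ROLE>John Ferguson</ROLE></FILM>'.
h parseDocument.
h textByTag
```

After parsing, textByTag should hold the text chunks seen inside ROLE. The 
sketch only tracks the innermost open element and forgets it at the first 
closing tag, so a real handler would probably keep a stack of open tags and 
skip whitespace-only chunks.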


I hope I helped you with your question. =)



------
Brice Govin
PhD student in RMoD research team at INRIA Lille
Software Engineer at THALES AIR SYSTEMS Rungis
ENSTA-Bretagne ENSI2014

22 Avenue du General Leclerc 92340 BOURG-LA-REINE




From: stepharo [via Smalltalk] <ml-node+[hidden email]>
Sent: Friday, 11 March 2016 13:37
To: Brice GOVIN
Subject: How to access XML tag name?

Hi

Yesterday I started to hack a smart SAX handler. The idea is that I just
specify the tags I want to visit, and the SAX handler invokes the (generated)
visit methods. That way I can easily get visitors over an XML domain.

Here is an example of what I did.

 | h |
 h := SmartSAXHandler new
     visitor: (MyFilmVisitor new visitTags: #(FILM ROLE));
     on: FileSystem workingDirectory / 'FILMS.XML'.
 h parseDocument.
 ^ h



SmartSAXHandler >> startElement: aQualifiedName attributes: aDictionary
 "Dispatch to the generated visit method when the tag was registered."

 (visitor shouldVisit: aQualifiedName)
     ifTrue: [
         visitor
             perform: (visitor createdVisitSelector: aQualifiedName)
             with: aQualifiedName
             with: aDictionary ]


Object subclass: #GenericTagSAXVisitor
 instanceVariableNames: 'visitTag visitTags'
 classVariableNames: ''
 category: 'SmartXMLHandler'


GenericTagSAXVisitor>>visitTags: aCollection
 "set the tags that will lead to a call to a visitTag:with: method
 in the visitor"

 visitTags := aCollection collect: [ :each | each asLowercase ].
 self createVisitMethods.

and, in a subclass, the visit* methods are generated automatically.


Now, when I have an element whose content is Vertigo, I cannot figure out
where to get the Vertigo text.
I redefined several methods of SAXHandler but without success.




   
 Vertigo
 Drame
 USA

 James
 Stewart
 John Ferguson

 Kim
 Novak
 Madeleine Elster

 Scottie Ferguson, ancien inspecteur de police, est sujet au vertige depuis
 qu'il a vu mourir son collegue. Elster, son ami, le charge de surveiller sa
 femme, Madeleine, ayant des tendances suicidaires. Amoureux de la jeune femme
 Scottie ne remarque pas le piege qui se trame autour de lui et dont il va
 etre la victime...
   



