My apologies for the very delayed response. You caught us right
before the start of the Thanksgiving holidays here in the USA. We are
back at work now, so I'll take a look at this today.

On Wed, Nov 26, 2014 at 3:19 PM, Stefano Silvestri
<[email protected]> wrote:
> Hi Ted,
> as described in my previous email, I've launched my experiment. As I said,
> the final step of my pipeline is cluster labeling, using SenseClusters.
> I'd like to remind you that the system performs unsupervised relation
> extraction from the entities found in 988 clinical records (the entities
> have been extracted through UMLS databases, and we cluster the pairs of
> entities).
>
> To integrate SenseClusters cluster labeling into our system, I've produced
> a cluto-style output for the clustering results (around 160000 elements) and
> an rlabel file (same number), with the list of all the clustered elements.
> At this point, I have problems running format_clusters.
>
> To perform the labeling, I need the format_clusters output generated
> with the --context option. So I've created a senseval-2 file with
> text2sval.pl. The input file to text2sval.pl is plain text with one whole
> clinical record per line.
> Naturally, each context contains more than one cluster member.
> I haven't used any optional arguments with text2sval.pl.
>
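
Since the input is described as one whole clinical record per line, it may be worth ruling out records with embedded line breaks or stray blank lines before the file goes to text2sval.pl. The following is only a minimal sketch, assuming the records are available as Python strings; `records_to_lines` and `write_contexts` are hypothetical helper names and are not part of SenseClusters.

```python
import re


def records_to_lines(records):
    """Collapse each record to a single line and drop empty records."""
    lines = []
    for record in records:
        # Replace every run of whitespace (including newlines) with one space.
        flat = re.sub(r"\s+", " ", record).strip()
        if flat:  # skip records that are empty after trimming
            lines.append(flat)
    return lines


def write_contexts(records, path):
    """Write one record per line to `path`; return the number of lines written."""
    lines = records_to_lines(records)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(lines)
```

The number returned by `write_contexts` can then be compared with the number of instance ids in the generated senseval-2 file.
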
> This output has 988 instance ids. Now, when I try to launch format_clusters,
> I get the following error, which occurs while parsing the senseval file:
> Use of uninitialized value $sentence in pattern match (m//) at
> ../.cpan/build/Text-SenseClusters-1.03-FMoSjn/Toolkit/evaluate/format_clusters.pl
> line 309, <SCON> line 5938. (when it reaches the last line of senseval2
> file).
>
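
The warning is reported at the last line of the senseval2 file, so one thing worth checking is whether every instance in that file actually carries a non-empty context. This is a minimal sketch, not a diagnosis of the warning itself. It assumes the usual Senseval-2 layout, in which each `<instance id="...">` block holds one `<context>...</context>` element; `check_senseval2` is a hypothetical helper, not part of SenseClusters.

```python
import re

# Assumed tag shapes; adjust if the generated file differs.
INSTANCE_RE = re.compile(r'<instance\s+id="([^"]*)"')
CONTEXT_RE = re.compile(r"<context>(.*?)</context>", re.DOTALL)


def check_senseval2(text):
    """Return (all instance ids, ids whose context is missing or empty)."""
    starts = list(INSTANCE_RE.finditer(text))
    ids, problems = [], []
    for i, match in enumerate(starts):
        # An instance block runs up to the next <instance ...> tag or to EOF.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        block = text[match.start():end]
        ids.append(match.group(1))
        context = CONTEXT_RE.search(block)
        if context is None or not context.group(1).strip():
            problems.append(match.group(1))
    return ids, problems
```

Reading the file with `open(path, encoding="utf-8").read()` and passing the result to `check_senseval2` gives the instance count and the ids that would need a closer look.
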
> I suspect that the contexts I used are wrong, so my questions are:
> 1) Should the contexts contain only the extracted entities, or the
> relations?
> 2) Must the number of contexts equal the number of clustered elements?
> 3) If nothing is (theoretically) wrong, what could be causing the error in
> the senseval file?
>
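
The message mentions about 160000 clustered elements and rlabel entries but only 988 instance ids in the senseval-2 file. If format_clusters expects one context per clustered row (an assumption that should be confirmed against its documentation), this gap would be the first thing to look at. Below is a minimal sketch for comparing the three counts; `compare_counts` is a hypothetical helper, and it assumes the cluto-style solution file and the rlabel file each hold one entry per non-blank line.

```python
def count_nonblank(lines):
    """Count the lines that contain something other than whitespace."""
    return sum(1 for line in lines if line.strip())


def compare_counts(solution_lines, rlabel_lines, n_contexts):
    """Return the three counts and whether they all agree."""
    counts = {
        "clustered rows": count_nonblank(solution_lines),
        "rlabel entries": count_nonblank(rlabel_lines),
        "contexts": n_contexts,
    }
    # The counts are consistent only if all three values are identical.
    consistent = len(set(counts.values())) == 1
    return counts, consistent
```

Here `n_contexts` would be the number of instance ids found in the senseval-2 file, and the two line lists can come from `open(path).readlines()`.
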
> I look forward to your response.
> Thank you for your attention; I hope you can help us complete our
> research.
>
>
> 2014-10-23 16:02 GMT+02:00 Stefano Silvestri <[email protected]>:
>>
>> Hi Ted and thanks.
>>
>> The PoS tagging, entity recognition, feature extraction and clustering
>> tasks are performed by our own system (not SenseClusters), which is still
>> in development.
>> Now I'm trying to use the clusterlabeling module of SenseClusters to show
>> that we have found, with an unsupervised approach, the relations between
>> medical entities in the clinical records (e.g. diabetes mellitus <>
>> glycemia) and to obtain, in this way, some labels for the clusters.
>>
>> I'm now writing the code to create the context files and then I'll run the
>> experiments on cluster labeling. I'll let you know in a few days if
>> everything worked well and, in case of a new publication, I'll cite your
>> great work.
>>
>> I'm sure I will have more questions in the coming days, so I thank you
>> in advance.
>> Stefano Silvestri
>>
>>
>> 2014-10-23 15:07 GMT+02:00 Ted Pedersen <[email protected]>:
>>>
>>> Hi Stefano,
>>>
>>> This sounds like an interesting project, and it's good to know
>>> SenseClusters is proving to be useful. See my responses inline...
>>>
>>> On Wed, Oct 22, 2014 at 5:58 AM, Stefano Silvestri
>>> <[email protected]> wrote:
>>> > I've used a clustering technique to discover, in an unsupervised way,
>>> > relations between medical entities contained in a large collection of
>>> > anonymized medical records, in a research project at the University of
>>> > Naples.
>>> > The data set consists of a large set of features; all the results
>>> > will be published shortly in a journal.
>>> >
>>> > The next step in the development of our system is to perform
>>> > unsupervised cluster (relation) labeling. To do that, I plan to try the
>>> > clusterlabeling module from SenseClusters. To create the input for
>>> > clusterlabeling I have to use the format_clusters module with the
>>> > --context option, and now I have some problems.
>>> >
>>> > I have already produced a cluto-style cluster solution file (no problem
>>> > with that) from my system.
>>> >
>>> > The rlabel file, if I'm right, is a file containing the explicit
>>> > name corresponding to each entity in the cluster (in my case the
>>> > relation).
>>> > Is that right?
>>>
>>> Yes, rlabel shows the cluster to which each instance has been assigned.
>>>
>>> >
>>> > And now the problems with the context file...
>>> > It should be in senseval2 format. My experimental setup is made of
>>> > plain text files, so I should use the plain-text-to-headless-senseval2
>>> > utility.
>>> >
>>> > I have some questions.
>>> >
>>> > 1) Does the context file have to combine all my input files (the
>>> > medical records) into one large file (with each context corresponding
>>> > to a medical record)?
>>>
>>> Yes, the input for each run of SenseClusters should be a single file
>>> with all your contexts included.
>>>
>>> >
>>> > 2) Can the contexts be headless, or do I have to tag (<head></head>) all
>>> > the entities (medical names) in the input?
>>>
>>> Your contexts can be headless, and so there is no need to include
>>> <head> tags in your contexts.
>>>
>>> >
>>> > 3) Are there other constraints on the context files (formatting, tags,
>>> > or other)?
>>> >
>>>
>>> There shouldn't be. The output from text2sval.pl should be acceptable
>>> for input "as is".
>>>
>>> > If the experiments succeed, of course, I'll credit and cite the
>>> > SenseClusters project.
>>> >
>>> > PS - my system works on the Italian language.
>>>
>>> That's great! We'd be happy to answer further questions as they arise,
>>> and will be curious to know how things work out!
>>>
>>> Good luck,
>>> Ted
>>>
>>> >
>>> > Thanks for your response,
>>> > Stefano Silvestri,
>>> > NLP researcher at University of Naples "Federico II"
>>> >
>>> >
>>> > _______________________________________________
>>> > senseclusters-users mailing list
>>> > [email protected]
>>> > https://lists.sourceforge.net/lists/listinfo/senseclusters-users
>>> >
>>>
>>>
>>>
>>> --
>>> Ted Pedersen
>>> http://www.d.umn.edu/~tpederse
>>>
>>>
>>
>>
>
>
>
