Hello, I have a problem designing a schema in Solr. I have a transcript of a
telephone conversation in the format shown further down. I parse it into
individual fields and currently index it like this:

<?xml version="1.0"?>
<add>
<doc>
<field name="id">01.cn</field>
<field name="t">0<br /> 1<br /> 2<br /> 2 <br /> 3 <br /> ....</field>
<field name="st">0.00<br /> 1.54<br /> 1.54<br /> 1.54 <br /> 1.57 <br />
....</field>
<field name="et">1.54<br /> 1.54<br /> 1.57<br /> 1.57 <br /> 1.7 <br />
....</field>
<field name="w">_SILENCE_<br /> <s><br /> HELLO<br /> HALLO <br /> _DELETE_
<br /> ....</field>
<field name="p">0.000000<br /> 1<br /> 1<br /> 2.06115e-009 <br /> 1 <br />
....</field>
<field name="c">0<br /> 0<br /> 0<br /> 0 <br /> 0 <br /> ....</field>
</doc>
</add>

I display it in an HTML document, which is why I used <br />.

This is the original document:

T=0 ST=0.00 ET=1.54 W=_SILENCE_ P=0.000000 C=0
T=1 ST=1.54 ET=1.54 W=<s> P=1 C=0
T=2 ST=1.54 ET=1.57 W=HELLO P=1 C=0
T=2 ST=1.54 ET=1.57 W=HALLO P=2.06115e-009 C=0
T=3 ST=1.57 ET=1.70 W=_DELETE_ P=1 C=0
T=3 ST=1.57 ET=1.70 W=NO P=2.06115e-009 C=0
T=4 ST=1.70 ET=2.12 W=HOW P=1 C=0
T=5 ST=2.12 ET=2.18 W=ARE_ P=0.25 C=0
T=5 ST=2.12 ET=2.18 W=_DELETE_ P=0.25 C=0
..........................................
..........................................

Id = Filename
T = Segment
ST = Start time
ET = End time
W = Word
P = Probability
C = Channel
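
For reference, here is a minimal sketch of how lines in this format could be
parsed into one record per row. It is written in Python only for
illustration; the names PAIR and parse_line are made up for this example, and
it assumes that values never contain spaces, as in the sample above.

```python
import re

# Matches KEY=VALUE pairs such as "ST=1.54" or "W=<s>".
# Assumption: values contain no whitespace, as in the sample transcript.
PAIR = re.compile(r"(\w+)=(\S+)")

def parse_line(line):
    """Parse one transcript line into a dict with typed values."""
    raw = dict(PAIR.findall(line))
    return {
        "t": int(raw["T"]),      # segment
        "st": float(raw["ST"]),  # start time
        "et": float(raw["ET"]),  # end time
        "w": raw["W"],           # word
        "p": float(raw["P"]),    # probability
        "c": int(raw["C"]),      # channel
    }

lines = [
    "T=0 ST=0.00 ET=1.54 W=_SILENCE_ P=0.000000 C=0",
    "T=2 ST=1.54 ET=1.57 W=HELLO P=1 C=0",
    "T=2 ST=1.54 ET=1.57 W=HALLO P=2.06115e-009 C=0",
]
records = [parse_line(line) for line in lines]
```

Each record keeps the word together with its own segment and times, which is
the association that gets lost when all values go into one field per file.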

For example, I want to search for a word that occurs up to time 1.57:
(w:HeLLO) AND (t:[0 TO 1.57]). But because each field (t, st, et, ...) holds
all the values for the whole file, this doesn't work. It also finds files
where hello occurs at a later time than 1.57.
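
To illustrate what I am after: one layout I could imagine is one Solr
document per transcript row, so that the word and its times belong to the
same document and both query clauses apply to the same row. Below is a rough
Python sketch that builds such an <add> message. The id scheme (filename plus
row index), the extra "file" field and the function name rows_to_add_xml are
assumptions made up for this example, and I have not checked whether this
layout is a good fit for Solr.

```python
import xml.etree.ElementTree as ET

def rows_to_add_xml(filename, rows):
    """Build a Solr <add> message with one <doc> per transcript row."""
    add = ET.Element("add")
    for i, row in enumerate(rows):
        doc = ET.SubElement(add, "doc")
        # Hypothetical id scheme: filename plus row index keeps ids unique.
        # The "file" field (also hypothetical) records the source file.
        fields = {"id": f"{filename}-{i}", "file": filename, **row}
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes "<" and ">" on output
    return ET.tostring(add, encoding="unicode")

rows = [
    {"t": 1, "st": 1.54, "et": 1.54, "w": "<s>", "p": 1.0, "c": 0},
    {"t": 2, "st": 1.54, "et": 1.57, "w": "HELLO", "p": 1.0, "c": 0},
]
xml_message = rows_to_add_xml("01.cn", rows)
print(xml_message)
```

With this layout every row is its own document, so a word and a time range
in the same query would be checked against the same row instead of against
the whole file.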

Do you have any ideas how to do this? Thanks a lot for your help.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Design-optimal-Solr-Schema-tp4166632.html
Sent from the Solr - User mailing list archive at Nabble.com.