Re: Problems with beagle-0.3.1 with .htm- files

2008-01-07 Thread Rainer Krienke
On Sunday, 6 January 2008 14:41:24, you wrote:
> > > This is a problem in recognizing mimetypes of the files. Beagle uses
> > > freedesktop.org spec shared-mime-info and implementation xdgmime to
> > > determine the type of a file. In this case, shared-mime-info diagnosed
> > > those files are mozilla-bookmark files. Mozilla-bookmark files have a
> > > slightly different structure, so beagle does not index them. :(
> >
> > Thanks for the explanation. However I am not sure why mime detection
> > fails. The head of the file looks like this:
>
> I completely forgot that I myself commented on the freedesktop.org bug
> for this :(
> You are facing this problem,
> https://bugs.freedesktop.org/show_bug.cgi?id=11843
>

Thanks, that helped.

I found a new SuSE shared-mime-info package (that will be part of the next
openSuSE distribution release) that seems to contain a fix for this
problem. After I installed it, beagle now indexes my ".htm" files without
problems.

Thanks
Rainer
-- 
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html,Fax: +49261287 1001312


signature.asc
Description: This is a digitally signed message part.
___
Dashboard-hackers mailing list
Dashboard-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/dashboard-hackers


Re: Problems with beagle-0.3.1 with .htm- files

2008-01-06 Thread D Bera
> > This is a problem in recognizing mimetypes of the files. Beagle uses
> > freedesktop.org spec shared-mime-info and implementation xdgmime to
> > determine the type of a file. In this case, shared-mime-info diagnosed
> > those files are mozilla-bookmark files. Mozilla-bookmark files have a
> > slightly different structure, so beagle does not index them. :(
>
> Thanks for the explanation. However I am not sure why mime detection fails.
> The head of the file looks like this:

I completely forgot that I myself commented on the freedesktop.org bug
for this :(
You are facing this problem,
https://bugs.freedesktop.org/show_bug.cgi?id=11843

Some distros patched their shared-mime-info to fix this, and later it
was probably fixed upstream.
http://www.nabble.com/Re:-Right-way-to-sniff-mimetype-from-globs-p13722083.html

The xdg-mime executable doesn't directly use shared-mime-info in the
recommended way; it does something with kfile on KDE and gnomevfs-info
on GNOME (I don't remember fully, but it does not follow the usual
steps).

- dBera

-- 
-
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user


Re: Problems with beagle-0.3.1 with .htm- files

2008-01-05 Thread Rainer Krienke
On Friday, 4 January 2008, you wrote:
> > I am running beagle 0.3.1 (compiled manually) on a openSuSE 10.3 system.
> > I have a directory that contains the online articles of a german computer
> > magazine named "ct".  The files contained are mostly named ".htm" and
> > contain plain html text.
> >
> > When I try to index the "ct"  directory with beagle it complains with the
> > following error message in file current-IndexHelper:
> >
> > 20080104 09:49:03.9370 17841 IndexH DEBUG: No filter for
> > file:///opt/zeitschriften/ct/html/05/19/220/art.htm
> > (/opt/zeitschriften/ct/html/05/19/220/art.htm)
> > [application/x-mozilla-bookmarks]
>
> This is a problem in recognizing mimetypes of the files. Beagle uses
> freedesktop.org spec shared-mime-info and implementation xdgmime to
> determine the type of a file. In this case, shared-mime-info diagnosed
> those files are mozilla-bookmark files. Mozilla-bookmark files have a
> slightly different structure, so beagle does not index them. :(

Thanks for the explanation. However I am not sure why mime detection fails.
The head of the file looks like this:

 Buchkritik
: Mac OS X, SpamAssassin

So it actually is text/html. And if I query xdg-mime for the mimetype of this
file, it also says text/html:

$ xdg-mime query filetype /opt/zeitschriften/ct/html/05/19/220/art.htm
text/html

So my question is: how exactly does beagle determine the mime type of a file
when trying to index it? Knowing this could help me fix the problem.

Thanks
Rainer


-- 
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 1001312




Re: Problems with beagle-0.3.1 with .htm- files

2008-01-05 Thread Joe Shaw
Hi,

On Jan 4, 2008 5:23 PM, D Bera <[EMAIL PROTECTED]> wrote:
> > 20080104 09:49:03.9370 17841 IndexH DEBUG: No filter for
> > file:///opt/zeitschriften/ct/html/05/19/220/art.htm
> > (/opt/zeitschriften/ct/html/05/19/220/art.htm)
> > [application/x-mozilla-bookmarks]
>
> This is a problem in recognizing mimetypes of the files. Beagle uses
> freedesktop.org spec shared-mime-info and implementation xdgmime to
> determine the type of a file. In this case, shared-mime-info diagnosed
> those files are mozilla-bookmark files.

Yeah, this isn't a Beagle-specific problem.  If you google for
"application/x-mozilla-bookmarks" you'll find a bunch of similar
problems.

> Mozilla-bookmark files have a
> slightly different structure, so beagle does not index them. :(

Really?  Aren't they just a specialized type of HTML?

Joe


Re: Problems with beagle-0.3.1 with .htm- files

2008-01-04 Thread D Bera
> I am running beagle 0.3.1 (compiled manually) on a openSuSE 10.3 system. I
> have a directory that contains the online articles of a german computer
> magazine named "ct".  The files contained are mostly named ".htm" and contain
> plain html text.
>
> When I try to index the "ct"  directory with beagle it complains with the
> following error message in file current-IndexHelper:
>
> 20080104 09:49:03.9370 17841 IndexH DEBUG: No filter for
> file:///opt/zeitschriften/ct/html/05/19/220/art.htm
> (/opt/zeitschriften/ct/html/05/19/220/art.htm)
> [application/x-mozilla-bookmarks]

This is a problem in recognizing the mimetypes of the files. Beagle uses
the freedesktop.org shared-mime-info spec and its xdgmime implementation
to determine the type of a file. In this case, shared-mime-info identified
those files as mozilla-bookmark files. Mozilla-bookmark files have a
slightly different structure, so beagle does not index them. :(

Technically this is a problem with shared-mime-info. They should have
better rules for deciding the right mimetypes. However, determining the
mimetype correctly 100% of the time is impossible; you can try filing a
bug against shared-mime-info, and if they fix it, well and good. Most
likely the .htm files have some line near the beginning that makes
them look like mozilla-bookmark files.

There is one last resort in beagle, though it needs a bit of work. If a
file has an extended attribute "user.mimetype", then beagle will use its
value instead of trying to determine the mimetype. If you can manage to
set the extended attribute to "text/html" for all those files, then beagle
will index them.
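
A minimal sketch of how that attribute could be set in bulk, assuming a
Linux system and a filesystem mounted with user extended attribute support.
Only the attribute name "user.mimetype" and the value "text/html" come from
the paragraph above; the function names and the `setter` hook are
hypothetical and not part of beagle.

```python
import os
from pathlib import Path

# Attribute name and value as described in the message above.
XATTR_NAME = "user.mimetype"
XATTR_VALUE = b"text/html"


def find_htm_files(root):
    """Return all regular *.htm files below root, sorted for stable output."""
    return sorted(p for p in Path(root).rglob("*.htm") if p.is_file())


def tag_as_html(root, setter=None):
    """Set user.mimetype=text/html on every .htm file below root.

    `setter` is a hypothetical hook so the traversal can be tried without
    touching real attributes. By default it is os.setxattr, which is only
    available on Linux and raises OSError if the filesystem does not
    support user extended attributes.
    """
    if setter is None:
        setter = os.setxattr
    tagged = []
    for path in find_htm_files(root):
        setter(str(path), XATTR_NAME, XATTR_VALUE)
        tagged.append(path)
    return tagged
```

For the directory in this thread the call would be something like
tag_as_html("/opt/zeitschriften/ct/html"). Whether beagle re-reads the
attribute for files it has already looked at is not covered here, so a
re-index may be needed.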

I am sorry, that's the best solution I have right now.

- dBera

-- 
-
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user


Problems with beagle-0.3.1 with .htm- files

2008-01-04 Thread Rainer Krienke
Hello,

I am running beagle 0.3.1 (compiled manually) on an openSuSE 10.3 system. I
have a directory that contains the online articles of a German computer
magazine named "ct". Most of the files in it have the extension ".htm" and
contain plain HTML text.

When I try to index the "ct" directory with beagle, it logs the following
message in the file current-IndexHelper:

20080104 09:49:03.9370 17841 IndexH DEBUG: No filter for 
file:///opt/zeitschriften/ct/html/05/19/220/art.htm 
(/opt/zeitschriften/ct/html/05/19/220/art.htm) 
[application/x-mozilla-bookmarks]

The contents of all these .htm files are not indexed, and consequently I
cannot search the archive using beagle.
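
To see how many files are affected and which mimetype they were given, the
log could be summarized with a short script. This is only a sketch: it
assumes that every entry has the same shape as the one quoted above
("No filter for", then the URI, the path in parentheses, and the mimetype in
square brackets), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Shape of the message quoted above:
#   ... DEBUG: No filter for <uri> (<path>) [<mimetype>]
NO_FILTER = re.compile(r"No filter for\s+(\S+)\s+\((.+?)\)\s+\[([^\]]+)\]")


def unfiltered_by_mimetype(log_text):
    """Count 'No filter for' entries per detected mimetype."""
    # Collapse all whitespace first, so an entry that was wrapped over
    # several lines (as in this mail) is matched like a one-line entry.
    flat = " ".join(log_text.split())
    return Counter(match.group(3) for match in NO_FILTER.finditer(flat))
```

Passing the text of current-IndexHelper to unfiltered_by_mimetype() returns
a Counter keyed by mimetype; its most_common() method lists the mimetypes
with the most skipped files first.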

Any idea how I could solve this problem and get the .htm files indexed?

Thanks
Rainer
-- 
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html,Fax: +49261287 1001312

