Re: dmd json file output

2013-01-26 Thread Walter Bright

On 1/26/2013 2:25 AM, Rainer Schuetze wrote:

I updated dmd from github and had a look at the current json output: it's
horrible. Below is a random example of a simple function.


Yeah, it's pretty bad.



Re: dmd json file output

2013-01-26 Thread Rainer Schuetze



On 23.01.2013 06:42, Andrei Alexandrescu wrote:

On 1/22/13 3:36 PM, Walter Bright wrote:

On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:

On 1/22/13 2:48 AM, Walter Bright wrote:

On 1/21/2013 10:56 PM, ric wrote:

Would it be reasonable to put an option whether to produce the (too)
verbose
json output or the minimal one?


I'd rather we make a decision.


Verbose should probably be it.


Rationale?


You can always filter out the verboseness with a simple program, but you
can't add missing information.

If the efficiency of generating json ever comes up, _then_ it's worth
looking into an option that produces less verbose output directly. For
now be verbose and let downstream tools filter it out.


Andrei



I updated dmd from github and had a look at the current json output: 
it's horrible. Below is a random example of a simple function.


- the function parameters are listed three times with different type 
information


- "originalType" seems to be always shown, even though it was probably 
meant to be shown only if it differs from "type"


- if the parameter identifiers are listed separately anyway, they should 
not be part of the type while the types do not have to be repeated in the 
actual parameter list


- package and module are specified inconsistently, sometimes as an array 
of strings, sometimes in dot-notation, sometimes not at all.


- types are sometimes shown expanded, sometimes not (e.g. "string")

- template instantiations from imported source files are listed

- functions and template instantiations that are only used at compile 
time are listed


- I appreciate that some missing information has been added, like 
imports and storage class


- renamed imports don't show the original module name

- functions implemented through template mixins are not listed

- surprisingly the average output has only become about 10 times larger 
for a medium sized project like Visual D (73 MB instead of 8 MB). Having 
only std.json available for reading it, I suspect it will definitely 
have an impact on IDE performance, though.


I understand that most of these issues are QOI issues, but there also 
seems to be a shift in the target usage of the JSON output. It 
was a means for source code browsing with output similar to generated di 
files, while it is now showing everything written into object files 
similar to debug info. Some of this can easily be filtered out (e.g. 
"template instance") but not all (e.g. functions from other modules only 
used in CTFE).


So I think that we should remove excessive bloat (e.g. always specify 
package and module lists in dot notation), make output more consistent 
and avoid listing the same type again and again. If a type is specified 
by its mangled name in declarations, add it to a dictionary at the end 
of the json file in its full verbosity. (I agree core.demangle does not 
help you if you want to do anything more than just getting the pretty 
type string). Please be aware that you will have to document the JSON 
type format in addition to the existing name mangling, though.
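
A rough sketch of the dictionary idea, in Python. It assumes each verbose type object also carries a "mangled" field (as in the mangled/pretty compromise proposed in this thread); the function name intern_types and the top-level "members"/"types" layout are hypothetical and not something dmd emits.

```python
import json

def intern_types(declarations):
    """Replace each declaration's verbose type by its mangled name and
    collect the full description of every distinct type exactly once."""
    types = {}
    slim = []
    for decl in declarations:
        decl = dict(decl)  # shallow copy, the input is left untouched
        full = decl.get("type")
        if isinstance(full, dict) and "mangled" in full:
            key = full["mangled"]
            # Keep the verbose description only for the first occurrence.
            types.setdefault(key, {k: v for k, v in full.items() if k != "mangled"})
            decl["type"] = key
        slim.append(decl)
    # Hypothetical layout: declarations first, type dictionary at the end.
    return {"members": slim, "types": types}

# "Aya" is the mangled form of immutable(char)[], i.e. string.
string_type = {
    "mangled": "Aya",
    "kind": "darray",
    "pretty": "string",
    "elementType": {"kind": "char", "pretty": "immutable(char)"},
}
params = [
    {"name": "attr", "type": string_type},
    {"name": "val", "type": string_type},
]
result = intern_types(params)
print(json.dumps(result, indent=1))
```

Both parameters end up referring to "Aya", and the expanded description of string is written only once.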


Rainer


JSON output for
void setAttribute(Element elem, string attr, string val);

dmd 2.061:
{
"name" : "setAttribute",
"kind" : "function",
"protection" : "public",
"type" : "void(Element elem, string attr, string val)",
"line" : 37}
,


dmd 2.062alpha:
   {
"name" : "setAttribute",
"kind" : "function",
"loc" : {
 "line" : 37
},
"module" : {
 "name" : "xmlwrap",
 "kind" : "module",
 "package" : [
  "visuald"
 ],
 "prettyName" : "visuald.xmlwrap"
},
"type" : {
 "kind" : "function",
 "pretty" : "void(Element elem, string attr, string val)",
 "returnType" : {
  "kind" : "void",
  "pretty" : "void"
 },
 "parameters" : [
  {
   "name" : "elem",
   "type" : {
"kind" : "class",
"pretty" : "std.xml.Element"
   }
  },
  {
   "name" : "attr",
   "type" : {
"kind" : "darray",
"pretty" : "string",
"elementType" : {
 "kind" : "char",
 "pretty" : "immutable(char)",
 "modifiers" : " immutable"
}
   }
  },
  {
   "name" : "val",
   "type" : {
"kind" : "darray",
"pretty" : "string",
"elementType" : {
 "kind" : "char",
 "pretty" : "immutable(char)",
 "modifiers" : " immutable"
}
   }
  }
 ]
},
"originalType" : {
 "kind" : "function",
 "pretty" : "void(Element elem, string attr, string val)",
 "returnType" : {
  "kind" : "void",
  "pretty" : "void"
 },
 "parameters" : [
  {
   "name" : "elem",
   "type" : {
"kind" : "identifier",
"pretty" : "Element",
"idents" : [],
"rawIdentifier" : "Element",
"identifier" : "Element"
   }
  },
  {
   "name" : "attr",
   "type" : {
"kind" : "identifier",
"pretty" : 

Re: dmd json file output

2013-01-23 Thread Jacob Carlborg

On 2013-01-23 22:38, Nathan M. Swan wrote:


Not every program using the json output will be in D (especially IDEs).


That's a good point. I don't think it hurts to have both. It's still a 
lot less code/text than the current format.


--
/Jacob Carlborg


Re: dmd json file output

2013-01-23 Thread Nathan M. Swan
On Wednesday, 23 January 2013 at 20:02:36 UTC, Jacob Carlborg 
wrote:

On 2013-01-23 18:09, Timon Gehr wrote:


That still requires at least one of two secondary parsers.


Technically yes, but there's already a demangler available in 
Phobos/druntime.


Not every program using the json output will be in D (especially 
IDEs).


NMS


Re: dmd json file output

2013-01-23 Thread Walter Bright

On 1/23/2013 12:02 PM, Jacob Carlborg wrote:

On 2013-01-23 18:09, Timon Gehr wrote:


That still requires at least one of two secondary parsers.


Technically yes, but there's already a demangler available in Phobos/druntime.



Yup. The "pretty" attribute is completely redundant.


Re: dmd json file output

2013-01-23 Thread Jacob Carlborg

On 2013-01-23 18:09, Timon Gehr wrote:


That still requires at least one of two secondary parsers.


Technically yes, but there's already a demangler available in 
Phobos/druntime.


--
/Jacob Carlborg


Re: dmd json file output

2013-01-23 Thread Timon Gehr

On 01/23/2013 05:07 PM, Jacob Carlborg wrote:

On 2013-01-23 08:58, Andrei Alexandrescu wrote:


If we need a secondary parser to slice and dice the json output, we
failed producing good json output.


That's what I'm saying. Just use what Rainer suggested:

"type" : {
 "mangled" : "PPPi",
 "pretty" : "int***",
}



That still requires at least one of two secondary parsers.


Re: dmd json file output

2013-01-23 Thread Andrei Alexandrescu

On 1/23/13 11:07 AM, Jacob Carlborg wrote:

On 2013-01-23 08:58, Andrei Alexandrescu wrote:


If we need a secondary parser to slice and dice the json output, we
failed producing good json output.


That's what I'm saying. Just use what Rainer suggested:

"type" : {
"mangled" : "PPPi",
"pretty" : "int***",
}


Yes please. Err on the side of verboseness as long as filtering out the 
unnecessary output is easy.


Andrei



Re: dmd json file output

2013-01-23 Thread Jacob Carlborg

On 2013-01-23 08:58, Andrei Alexandrescu wrote:


If we need a secondary parser to slice and dice the json output, we
failed producing good json output.


That's what I'm saying. Just use what Rainer suggested:

"type" : {
"mangled" : "PPPi",
"pretty" : "int***",
}

--
/Jacob Carlborg


Re: dmd json file output

2013-01-23 Thread Andrei Alexandrescu

On 1/23/13 2:41 AM, Jacob Carlborg wrote:

On 2013-01-22 20:53, Sönke Ludwig wrote:


Consider "int[4u] delegate(scope float*[void function(scope int)] p1,
Rebindable!(const(C))*[]* b)"

There are actually quite some things to parse in human readable type
strings, I even remember some
expressions. And parsing this is at least as language specific as the
mangled name. But I agree that
having both should be a good compromise.


This wouldn't be fun to parse. It basically requires a front end.


If we need a secondary parser to slice and dice the json output, we 
failed producing good json output.


Andrei



Re: dmd json file output

2013-01-22 Thread Jacob Carlborg

On 2013-01-23 06:45, Walter Bright wrote:


Using the deco string is not missing information - and it's easier to
parse it and manipulate it.


I vote for the suggestion by Rainer:

"type" : {
"mangled" : "PPPi",
"pretty" : "int***",
}

--
/Jacob Carlborg


Re: dmd json file output

2013-01-22 Thread Jacob Carlborg

On 2013-01-22 20:53, Sönke Ludwig wrote:


Consider "int[4u] delegate(scope float*[void function(scope int)] p1, 
Rebindable!(const(C))*[]* b)"

There are actually quite some things to parse in human readable type strings, I 
even remember some
expressions. And parsing this is at least as language specific as the mangled 
name. But I agree that
having both should be a good compromise.


This wouldn't be fun to parse. It basically requires a front end.

--
/Jacob Carlborg


Re: dmd json file output

2013-01-22 Thread Walter Bright

On 1/22/2013 9:42 PM, Andrei Alexandrescu wrote:

On 1/22/13 3:36 PM, Walter Bright wrote:

On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:

On 1/22/13 2:48 AM, Walter Bright wrote:

On 1/21/2013 10:56 PM, ric wrote:

Would it be reasonable to put an option whether to produce the (too)
verbose
json output or the minimal one?


I'd rather we make a decision.


Verbose should probably be it.


Rationale?


You can always filter out the verboseness with a simple program, but you can't
add missing information.

If the efficiency of generating json ever comes up, _then_ it's worth looking
into an option that produces less verbose output directly. For now be verbose
and let downstream tools filter it out.


Using the deco string is not missing information - and it's easier to parse it 
and manipulate it.




Re: dmd json file output

2013-01-22 Thread Andrei Alexandrescu

On 1/22/13 3:36 PM, Walter Bright wrote:

On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:

On 1/22/13 2:48 AM, Walter Bright wrote:

On 1/21/2013 10:56 PM, ric wrote:

Would it be reasonable to put an option whether to produce the (too)
verbose
json output or the minimal one?


I'd rather we make a decision.


Verbose should probably be it.


Rationale?


You can always filter out the verboseness with a simple program, but you 
can't add missing information.


If the efficiency of generating json ever comes up, _then_ it's worth 
looking into an option that produces less verbose output directly. For 
now be verbose and let downstream tools filter it out.
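
A minimal sketch of such a downstream filter, in Python. It assumes the verbose layout shown elsewhere in this thread, where every type object carries a "pretty" string; the names collapse_types and minimal and the sample declaration are made up for illustration.

```python
import json

def collapse_types(node):
    """Recursively replace verbose type objects by their "pretty" string."""
    if isinstance(node, list):
        return [collapse_types(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        is_type = key in ("type", "originalType")
        if is_type and isinstance(value, dict) and "pretty" in value:
            out[key] = value["pretty"]
        else:
            out[key] = collapse_types(value)
    return out

def minimal(decl):
    """Collapse types and drop "originalType" when it repeats "type"."""
    slim = collapse_types(decl)
    if slim.get("originalType") == slim.get("type"):
        slim.pop("originalType", None)
    return slim

# Sample input: the verbose type quoted in this thread for `int ***x;`
# (the surrounding "name"/"kind" fields are assumed for the example).
verbose = json.loads("""
{
 "name" : "x",
 "kind" : "variable",
 "type" : {
  "kind" : "pointer",
  "pretty" : "int***",
  "targetType" : {
   "kind" : "pointer",
   "pretty" : "int**",
   "targetType" : {
    "kind" : "pointer",
    "pretty" : "int*",
    "targetType" : { "kind" : "int", "pretty" : "int" }
   }
  }
 }
}
""")
slim = minimal(verbose)
print(json.dumps(slim))
```

The filter can only throw information away; going the other way would need a parser for the pretty string or the deco.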



Andrei



Re: dmd json file output

2013-01-22 Thread Walter Bright

On 1/22/2013 11:46 AM, Andrei Alexandrescu wrote:

On 1/22/13 2:48 AM, Walter Bright wrote:

On 1/21/2013 10:56 PM, ric wrote:

Would it be reasonable to put an option whether to produce the (too)
verbose
json output or the minimal one?


I'd rather we make a decision.


Verbose should probably be it.


Rationale?



Re: dmd json file output

2013-01-22 Thread Sönke Ludwig
On 22.01.2013 18:05, Tove wrote:
> (...)
> 
> "int***" is both compact and easy enough to parse anyway.
> 
Consider "int[4u] delegate(scope float*[void function(scope int)] p1, 
Rebindable!(const(C))*[]* b)"

There are actually quite some things to parse in human readable type strings,
I even remember some expressions. And parsing this is at least as language
specific as the mangled name. But I agree that having both should be a good
compromise.


Re: dmd json file output

2013-01-22 Thread Andrei Alexandrescu

On 1/22/13 2:48 AM, Walter Bright wrote:

On 1/21/2013 10:56 PM, ric wrote:

Would it be reasonable to put an option whether to produce the (too)
verbose
json output or the minimal one?


I'd rather we make a decision.


Verbose should probably be it.

Andrei


Re: dmd json file output

2013-01-22 Thread Tove
On Tuesday, 22 January 2013 at 08:02:26 UTC, Rainer Schuetze 
wrote:



> "type" : {
>  "mangled" : "PPPi",
>  "pretty" : "int***",
> }


I would favour plain "type" : "int***".

Consider it will be parsed from many different languages, C#, 
Java... etc and the generic tools may be able to handle json from 
multiple languages, and in this context have no reason to use 
differently mangled types for different languages.


"int***" is both compact and easy enough to parse anyway.

Even for pure D-based tools, for unit-test reasons it could be 
useful to have the pretty name to compare against, thus Rainer's 
proposal is a reasonable compromise.


Re: dmd json file output

2013-01-22 Thread Sönke Ludwig
On 22.01.2013 09:02, Rainer Schuetze wrote:
> 
> Considering function types, the deco does not contain any function argument 
> identifiers anymore, but
> these are very useful for tooltips in an IDE like Visual D.
> 

I thought so, too. But considering that types are always subject to this 
problem:

---
alias StateCallback = void function(int state);
static assert(StateCallback.stringof == "void function(int state)");

alias IndexCallback = void function(int index);
static assert(IndexCallback.stringof == "void function(int state)"); // still "state"
---

... it may be better to not even make it possible to fall into this trap by
excluding them. Except if I'm wrong and the JSON output happens at an earlier
stage where the parameter name information is still tagged to the declaration,
of course.


Re: dmd json file output

2013-01-22 Thread Rainer Schuetze



On 21.01.2013 08:27, Walter Bright wrote:

The current version is pretty verbose. For:

 int ***x;

it will emit as the type:

"type" : {
 "kind" : "pointer",
 "pretty" : "int***",
 "targetType" : {
 "kind" : "pointer",
 "pretty" : "int**",
 "targetType" : {
 "kind" : "pointer",
 "pretty" : "int*",
 "targetType" : {
 "kind" : "int",
 "pretty" : "int"
 }
 }
 }
}

I find this to be excessive, and it helps to produce truly gigantic
.json files. I think it's better to just put out the deco for the type:

"type" : "PPPi"

But, you might say, that is not user friendly! Nope, it isn't. But the
.json output is for a machine to read, not humans, and the deco types
are very space efficient, and are trivial to convert to whatever data
structure the reader needs. Much easier than the verbose thing.

What do you think?


I agree the verbose output is overkill.

Considering that the demangling in druntime still has a number of open 
issues (e.g. http://d.puremagic.com/issues/show_bug.cgi?id=3034, 
http://d.puremagic.com/issues/show_bug.cgi?id=6045) and that there are 
ambiguities in the name mangling (e.g. 
http://d.puremagic.com/issues/show_bug.cgi?id=5957, 
http://d.puremagic.com/issues/show_bug.cgi?id=4268), my first reaction 
was that it might be better to provide a function to parse the pretty 
type. It is not too difficult and would be a nice start for the 
lexer/parser topic, but might be burdened with new bugs.


Considering function types, the deco does not contain any function 
argument identifiers anymore, but these are very useful for tooltips in 
an IDE like Visual D.


As a compromise, the type should just contain the mangled and the pretty 
name:


> "type" : {
>  "mangled" : "PPPi",
>  "pretty" : "int***",
> }




Re: dmd json file output

2013-01-21 Thread Walter Bright

On 1/21/2013 10:56 PM, ric wrote:

Would it be reasonable to put an option whether to produce the (too) verbose
json output or the minimal one?


I'd rather we make a decision.


Re: dmd json file output

2013-01-21 Thread ric
Would it be reasonable to put an option whether to produce the (too) 
verbose json output or the minimal one?


Re: dmd json file output

2013-01-21 Thread Andrej Mitrovic
On 1/21/13, Walter Bright wrote:
> I think it's better to just put out the deco for the type:
>
> "type" : "PPPi"

It seems the simplest to implement. And core.demangle can be used to
get a string representation, which could eliminate the need for the
'pretty' field?

FWIW the way this is done for C++ typeinfo in .xml files is:

  
  
  
  
  

And then another variable such as PPi would have the type field set to _1.

But it would probably be overkill to try to do this for Json right
now, PPPi is a simple solution.


Re: dmd json file output

2013-01-21 Thread Johannes Pfau
On Sun, 20 Jan 2013 23:27:57 -0800, Walter Bright wrote:

> 
> I find this to be excessive, and it helps to produce truly
> gigantic .json files. I think it's better to just put out the deco
> for the type:
> 
> "type" : "PPPi"
> 
> But, you might say, that is not user friendly! Nope, it isn't. But
> the .json output is for a machine to read, not humans, and the deco
> types are very space efficient, and are trivial to convert to
> whatever data structure the reader needs. Much easier than the
> verbose thing.
> 
> What do you think?

How about compressing the json file (lzma)?

Should be just as space efficient, can be easily translated to user
readable output (uncompress), also trivial to read for machines. And it
also compresses the whitespace characters and other text.

https://github.com/D-Programming-Deimos/liblzma
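
A small round-trip sketch of the compression idea. It uses Python's standard lzma module rather than the D binding linked above, and the repeated sample entry is made up to imitate a large, repetitive .json file; actual ratios on real dmd output are not measured here.

```python
import json
import lzma

# One verbose entry in the style quoted in this thread, repeated to
# imitate a large and highly repetitive .json file (sample data only).
entry = {
    "name": "x",
    "type": {
        "kind": "pointer",
        "pretty": "int***",
        "targetType": {
            "kind": "pointer",
            "pretty": "int**",
            "targetType": {
                "kind": "pointer",
                "pretty": "int*",
                "targetType": {"kind": "int", "pretty": "int"},
            },
        },
    },
}
raw = json.dumps([entry] * 500, indent=1).encode("utf-8")

packed = lzma.compress(raw)  # xz container, default preset
restored = json.loads(lzma.decompress(packed).decode("utf-8"))

print(len(raw), "bytes raw,", len(packed), "bytes compressed")
```

Compression reduces the size on disk, but a reader still has to decompress and parse the full verbose text, so it does not address parsing cost in an IDE.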


Re: dmd json file output

2013-01-21 Thread kenji hara
Changing the output data to the mangled name is no problem. It provides enough
information for a machine to read.

Kenji Hara


2013/1/21 Walter Bright 

> On 1/20/2013 11:50 PM, kenji hara wrote:
>
>> I think there is no problem.
>>
>
> No problem with which scheme?
>
>


Re: dmd json file output

2013-01-21 Thread Walter Bright

On 1/20/2013 11:50 PM, kenji hara wrote:

I think there is no problem.


No problem with which scheme?



Re: dmd json file output

2013-01-21 Thread Walter Bright

On 1/20/2013 11:42 PM, Jacob Carlborg wrote:

Is there any documentation for these, or do we have to find it in the compiler
sources?



The PPPi is documented in the page on the ABI.


Re: dmd json file output

2013-01-20 Thread kenji hara
I think there is no problem.

Kenji Hara


2013/1/21 Walter Bright 

> The current version is pretty verbose. For:
>
> int ***x;
>
> it will emit as the type:
>
> "type" : {
> "kind" : "pointer",
> "pretty" : "int***",
> "targetType" : {
> "kind" : "pointer",
> "pretty" : "int**",
> "targetType" : {
> "kind" : "pointer",
> "pretty" : "int*",
> "targetType" : {
> "kind" : "int",
> "pretty" : "int"
> }
> }
> }
> }
>
> I find this to be excessive, and it helps to produce truly gigantic .json
> files. I think it's better to just put out the deco for the type:
>
> "type" : "PPPi"
>
> But, you might say, that is not user friendly! Nope, it isn't. But the
> .json output is for a machine to read, not humans, and the deco types are
> very space efficient, and are trivial to convert to whatever data structure
> the reader needs. Much easier than the verbose thing.
>
> What do you think?
>


Re: dmd json file output

2013-01-20 Thread Jacob Carlborg

On 2013-01-21 08:27, Walter Bright wrote:

The current version is pretty verbose. For:



I find this to be excessive, and it helps to produce truly gigantic
.json files. I think it's better to just put out the deco for the type:

"type" : "PPPi"

But, you might say, that is not user friendly! Nope, it isn't. But the
.json output is for a machine to read, not humans, and the deco types
are very space efficient, and are trivial to convert to whatever data
structure the reader needs. Much easier than the verbose thing.

What do you think?


Is there any documentation for these, or do we have to find it in the 
compiler sources?


--
/Jacob Carlborg


dmd json file output

2013-01-20 Thread Walter Bright

The current version is pretty verbose. For:

int ***x;

it will emit as the type:

"type" : {
"kind" : "pointer",
"pretty" : "int***",
"targetType" : {
"kind" : "pointer",
"pretty" : "int**",
"targetType" : {
"kind" : "pointer",
"pretty" : "int*",
"targetType" : {
"kind" : "int",
"pretty" : "int"
}
}
}
}

I find this to be excessive, and it helps to produce truly gigantic .json files. 
I think it's better to just put out the deco for the type:


"type" : "PPPi"

But, you might say, that is not user friendly! Nope, it isn't. But the .json 
output is for a machine to read, not humans, and the deco types are very space 
efficient, and are trivial to convert to whatever data structure the reader 
needs. Much easier than the verbose thing.
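
To illustrate the conversion, here is a sketch in Python that decodes a small subset of deco strings into the nested structure shown above. It handles only P (pointer), A (dynamic array) and a handful of basic type codes from the ABI page; modifiers, functions, delegates, static and associative arrays, and user-defined types are deliberately left out, and the names BASIC, decode and parse_deco are illustrative.

```python
# Basic type codes (a subset of those listed on the ABI page).
BASIC = {
    "v": "void", "b": "bool", "a": "char",
    "i": "int", "k": "uint", "l": "long", "m": "ulong",
    "f": "float", "d": "double",
}

def decode(deco, pos=0):
    """Decode one type starting at pos; return (type_tree, next_pos)."""
    if pos >= len(deco):
        raise ValueError("unexpected end of deco string")
    code = deco[pos]
    if code == "P":  # pointer to the type that follows
        target, pos = decode(deco, pos + 1)
        tree = {"kind": "pointer",
                "pretty": target["pretty"] + "*",
                "targetType": target}
        return tree, pos
    if code == "A":  # dynamic array of the type that follows
        elem, pos = decode(deco, pos + 1)
        tree = {"kind": "darray",
                "pretty": elem["pretty"] + "[]",
                "elementType": elem}
        return tree, pos
    if code in BASIC:
        name = BASIC[code]
        return {"kind": name, "pretty": name}, pos + 1
    raise ValueError("unsupported deco code: " + code)

def parse_deco(deco):
    """Decode a whole deco string and reject trailing characters."""
    tree, pos = decode(deco)
    if pos != len(deco):
        raise ValueError("trailing characters in deco string")
    return tree

tree = parse_deco("PPPi")
print(tree["pretty"])
```

For "PPPi" this rebuilds the same nested "kind"/"pretty"/"targetType" object as the verbose output. A complete decoder needs the full mangling grammar.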


What do you think?