What are the top tools / vendors in the detection of Software IP infringement?
Who are the top vendors, and what are the top tools that would allow me to scan through 2-4 million lines of Java, C#, and C++ code, and determine if there were any potential infringements with 3rd party libraries or other code?
Where does Protecode rank, for instance?
@Andrew, Bob is probably right. Some experts have written about the subject and offered advice on specific software vendors. Michael Barr gives a fairly extensive opinion on these tools here.
Michael points to Bob's CodeMatch (part of CodeSuite from S.A.F.E.) as one of the most powerful tools. (For more on CodeMatch, you may want to read Bob's book "The Software IP Detective's Handbook: Measurement, Comparison, and Infringement Detection".)
According to Michael, BlackDuck and Protecode are also valuable when you need to check your code against all possible OSS, while CodeMatch or BitMatch are effective at comparing your code to a specific suspected source (I'm not sure what SourceDetective from SAFE really does).
You may want to combine those tools to match your requirements.
I have been in the software industry for years and, as far as I am aware, no formal ranking has been done of the 70 pieces of software Bob mentioned. There is, however, a Gartner report that looked at this market niche (I have not read it; decide for yourself whether you trust Gartner), but its table of contents does not mention CodeMatch (you need to log in to Gartner to see the table of contents).
Andrew -- I developed a rule set for XTRAN, our computer language expert system, to find cloned code, using a two-pass (coarse and fine) code "signature" approach at the function level. The same approach can be used to find the inclusion of 3rd party library code. It would take three steps -- a) generate the coarse signatures for the 3rd party code, b) run them against the subject code, and c) if any hits are found, run the fine analysis on those modules to nail it down.
We ran the rule set on 2/3 million code lines of a client's code, and it found 40% potentially cloned code! (That turned out to be about right, according to the client.)
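For readers who want a feel for the coarse-then-fine idea described above, here is a minimal Python sketch. It is not XTRAN's rule set or its actual signature scheme; the coarse signature (counts of a few control keywords), the fine comparison (difflib similarity over identifier-normalized tokens), the 0.9 threshold, and all function names are assumptions made purely for illustration. It also assumes the functions have already been extracted as name-to-source-text dictionaries.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny keyword list; a real tool would use a full lexer.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "void", "class", "new"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def normalize(src):
    """Tokenize and replace identifiers and numbers with placeholders."""
    out = []
    for tok in TOKEN_RE.findall(src):
        if tok in KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok[0].isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return out


def coarse_signature(src):
    """Cheap signature: how many of each control keyword the function uses."""
    norm = normalize(src)
    return tuple(norm.count(k) for k in ("if", "for", "while", "return"))


def fine_score(src_a, src_b):
    """More expensive comparison: similarity of the normalized token streams."""
    return SequenceMatcher(None, normalize(src_a), normalize(src_b),
                           autojunk=False).ratio()


def find_clones(library_funcs, subject_funcs, threshold=0.9):
    """Steps a) to c): index library signatures, coarse-match, then fine-check."""
    index = defaultdict(list)
    for lib_name, lib_src in library_funcs.items():          # step a)
        index[coarse_signature(lib_src)].append(lib_name)
    hits = []
    for sub_name, sub_src in subject_funcs.items():          # step b)
        for lib_name in index.get(coarse_signature(sub_src), []):
            score = fine_score(sub_src, library_funcs[lib_name])  # step c)
            if score >= threshold:
                hits.append((sub_name, lib_name, score))
    return hits
```

With this sketch, a subject function that only renames variables of a library function gets the same coarse signature and a fine score of 1.0, while structurally different functions are filtered out cheaply in the coarse pass. An exact-match coarse signature like this one will miss clones that add or drop a branch, so a practical tool would need a more tolerant coarse pass.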
OTOH, if you're looking to find invocations of 3rd party APIs, that's even easier, because XTRAN has pattern-matching (and replacement) facilities built into its rules language. It would be trivial to create rules that look for and report any function calls to the APIs in question. Ditto any use of data types and classes those APIs require.
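As a rough illustration of what "look for and report any function calls to the APIs in question" means, here is a small Python sketch that uses a plain regular expression. It is not XTRAN's rules language, and unlike a parser-based tool it will also flag matches inside comments and string literals. The function name find_api_calls and the API names acme_init and acme_send are hypothetical.

```python
import re


def find_api_calls(source, api_names):
    """Return (line_number, api_name) for each apparent call to a listed API."""
    if not api_names:
        return []
    # \b keeps "my_acme_send(" from matching "acme_send(".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in api_names) + r")\s*\("
    )
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

For example, calling find_api_calls(text, ["acme_init", "acme_send"]) on each file's text gives a per-line report of where the 3rd party API appears to be invoked.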
Information is at http://www.xtran-llc.com/