Solr Json Facet API

May 12, 2017

By Chakravarthy Yeleswarapu

I’m starting my Solr blog series with the JSON Facet API because it marks a watershed between the Solr 4.x and Solr 5.x versions. Prior to Solr 5.x, statistics and analytics were handled by the stats component, combined with faceting where needed. However, using stats and faceting together was kludgy and inefficient. To see what I mean, consider the following JSON structure.

[Image: sample JSON document structure (Solr-Json-facet_API)]

You’d want to facet by category, sub-facet by inStock and along the way obtain price statistics such as min/max.

We could use Solr’s stats component together with pivot faceting to achieve this.

http://localhost:8983/solr/techproducts/select?q=*:*&
stats=true&
stats.field={!tag=piv1 min=true max=true}price&
facet=true&
facet.pivot={!stats=piv1}category,inStock
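
If you assemble this request in code, the local-params syntax has to be URL-encoded. Here is a minimal Python sketch of how that could look, using only the standard library. The helper name build_stats_pivot_url is hypothetical; the host, collection, and field names are copied from the query above.

```python
from urllib.parse import urlencode

# Base URL copied from the example query; adjust host and collection as needed.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def build_stats_pivot_url(stat_field, pivot_fields, tag="piv1"):
    """Return a select URL that attaches min/max stats to a pivot facet.

    Hypothetical helper for illustration; it only builds the URL and does
    not contact Solr.
    """
    pivot = ",".join(pivot_fields)
    params = [
        ("q", "*:*"),
        ("stats", "true"),
        # The tag links this stats.field to the pivot facet below.
        ("stats.field", "{!tag=%s min=true max=true}%s" % (tag, stat_field)),
        ("facet", "true"),
        ("facet.pivot", "{!stats=%s}%s" % (tag, pivot)),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


url = build_stats_pivot_url("price", ["category", "inStock"])
print(url)
```

Note how the tag has to be repeated in two separate parameters to tie the statistics to the pivot. That coupling is part of what makes this approach awkward to build and maintain.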

Depending on the number of documents, the above query can take on the order of seconds, which is too slow for many use cases. For a web reporting application that responds to users’ clicks, latency measured in seconds may be unacceptable.

This is where the JSON Facet API comes in handy. The facet-pivoting query above can be rewritten using this new API as follows.

Find the top 5 categories and return the number of in-stock and out-of-stock products in each:

[Image: Solr-Json-facet-API]

Similarly, to find the top 5 categories and return the manufacturers under each category, just change the sub-facet field to manu:

[Image: Solr-Json-facet-API]
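
Since the two requests differ only in the sub-facet field, they can be generated from one template. The Python sketch below is my approximation of such a request, not a copy of the original screenshots. The facet labels (categories, by_inStock, min_price, max_price), the helper name top_categories_facet, the rows=0 parameter, and the placement of the min/max price aggregations are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode


def top_categories_facet(sub_field, limit=5):
    """Build a json.facet body: top `limit` categories, sub-faceted by `sub_field`.

    Hypothetical helper; the facet labels are illustrative and can be renamed.
    """
    return {
        "categories": {
            "type": "terms",          # terms facet over the category field
            "field": "category",
            "limit": limit,
            "facet": {
                # Nested terms facet, e.g. inStock or manu
                "by_" + sub_field: {"type": "terms", "field": sub_field},
                # Price statistics per category (assumed placement)
                "min_price": "min(price)",
                "max_price": "max(price)",
            },
        }
    }


in_stock_facet = top_categories_facet("inStock")
manu_facet = top_categories_facet("manu")

# rows=0 is an assumption: it skips returning documents when only facets matter.
params = {"q": "*:*", "rows": 0, "json.facet": json.dumps(in_stock_facet)}
url = "http://localhost:8983/solr/techproducts/select?" + urlencode(params)
print(url)
```

Swapping inStock for manu is a one-argument change, and the statistics sit next to the facet they belong to instead of being linked through tags.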

This approach has several advantages.

  1. QTime for this query is typically in milliseconds, even across millions of records.
  2. The query is quite intuitive to write, because the JSON structure is clearer than the local-params tags used in the facet-pivoting approach.
  3. Applications can programmatically manipulate the json.facet structure at runtime, for example in response to user clicks.
  4. Each json.facet structure can be maintained per use case, separately from the Solr query, and combined with it at runtime.
  5. For Java-based API implementations, the SolrJ client library has great support, in SolrCloud mode as well.
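
To make points 3 and 4 concrete, here is a minimal Python sketch of facet definitions kept per use case and merged at request time. Everything named here (FACETS_BY_USE_CASE, the use-case keys, build_request_params, limit_override) is hypothetical; the idea is only that json.facet is plain data an application can compose and adjust.

```python
import copy
import json

# Facet definitions kept per use case, separate from the main query.
# Names and contents are illustrative assumptions.
FACETS_BY_USE_CASE = {
    "stock_report": {
        "categories": {
            "type": "terms",
            "field": "category",
            "limit": 5,
            "facet": {"stock": {"type": "terms", "field": "inStock"}},
        }
    },
    "price_summary": {
        "min_price": "min(price)",
        "max_price": "max(price)",
    },
}


def build_request_params(query, use_cases, limit_override=None):
    """Merge the selected facet definitions and attach them to a query."""
    facet = {}
    for name in use_cases:
        # Deep copy so runtime tweaks never alter the stored templates.
        facet.update(copy.deepcopy(FACETS_BY_USE_CASE[name]))
    # Example of a runtime adjustment, e.g. the user asks for more categories.
    if limit_override is not None and "categories" in facet:
        facet["categories"]["limit"] = limit_override
    return {"q": query, "json.facet": json.dumps(facet)}


params = build_request_params("*:*", ["stock_report", "price_summary"], limit_override=10)
print(params["json.facet"])
```

The resulting dictionary can be sent as request parameters by whatever HTTP client the application already uses.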

The only caveat is that if the cardinality of a field (the number of unique values in it) exceeds 100 per shard, an estimation algorithm kicks in. There is already a Jira issue for this, which could be addressed in a future release.

Update: A patch has been committed for the Jira issue. It should roll out in version 6.5 or so.
