For the past few months I have been working on managing the Honcho project and its associated SDKs. We’ve been taking the approach of developing the SDK manually as we are focused on trying to find the best developer UX and maximize developer delight.

This has led to a rather arduous effort that has required a large amount of refactoring as we are making new additions to the project, and the capabilities of the platform rapidly expand.

While these efforts have been going on a new player in the SDK generation space dropped on hacker news.

When I first started working on Honcho I did a cursory look at a number of SDK generators, but wasn’t impressed with the results I saw. However, a lot of that was speculative and Honcho was not nearly as mature as it is now.

So spurred by the positive comments in the thread above I’ve decided to do a more detailed look into the space and, also try to develop a better understanding of what approaches are generally favorable in creating API client libraries.

Background

For a full understanding of Honcho I recommend the great Simple Honcho Primer post, but I’ll try to summarize the important details here.

Honcho is a personalization platform for LLM applications. It is infrastructure that developers can use for storing data related to their applications, deriving insights about their data and users, and evaluating the performance of their applications. This functionality is exposed through a REST API interface with the following resource constructs.

|____Apps
|____|____Users
|____|____|____Sessions
|____|____|____|____Messages
|____|____|____|____Metamessages
|____|____|____Collections
|____|____|____|____Documents

So Apps have Users that have Sessions and Collections where Sessions can have Messages and Metamessages and Collections can have Documents.

At the time of writing this post Honcho is being manually maintained with a singular client SDK for the Python ecosystem. The SDK is co-located in the repo.

The SDK is written in an object-oriented style where the top level Honcho object will return lower level objects such as a User object and Session objects. These objects contain the CRUD methods necessary to use them i.e.

from honcho import Honcho
 
honcho = Honcho("Test App Name")
honcho.initialize()
 
user = honcho.create_user("username")
session = user.create_session()

There is an Async version of the SDK with an AsyncHoncho class that uses objects such as AsyncSession and AsyncUser.

Guiding Questions

Before evaluating the below platforms I wanted to investigate a few questions I had about how to design SDKs and how they are generally maintained in other organizations. I’ve also included some questions I want to think about when looking at the different platforms

I’m doing this through the lense of a Python developer as Honcho currently only has a Python client library with plans to quickly expand to other ecosystems.

General SDK Questions

  1. Do developers prefer an object-oriented approach or singleton approach for client SDKs that wrap a REST API?
  2. What additional features are generally important in SDK design outside the functionality of the API (retry, pagination, etc.)?

Platform Specific Questions

  1. How readable and easy to understand is the generated code?
  2. How customizable is the end result?
  3. How easy was it to use the tool?
  4. What approach does the tool take? Object-oriented or singleton?
  5. How does it handle async vs sync interfaces?

Research

First I took a look at sources and posts onlines that talk in general about developing SDKs. This isn’t an exhaustive look at every link I looked at, but ones I thought were relevant. The notes are messy and not necessarily fully formed sentences.

Any design patterns and tips on writing an API client library

Things they are laying out here.

One person

  • Auth is really hard to figure out
  • Retry logic and pagination is really important

Another person

  • Keep data objects as just data and use other objects for transformations

^ basically advocating for the singleton model

Person 3

  • Also arguing for singleton approach. Made a good case where if you really only care about lower level stuff it’s annoying

Firstly, don’t go over the top with object-oriented modelling. If you’ve got an API call like:

GET /locations/12345/customers/65432/orders/87678768 Don’t implement this as:

client.location(12345).customer(65432).order(87678768).get()

Just implement:

client.get_order(12345, 65432, 87678768)

that last one is better tbh it’s just managing that data isn’t done within the object, which is my main problem.

So arguments for singleton approach are

  • harder to go to lower levels from the start

The object-oriented approach looks more readable.

A Design Pattern for Python API Client Libraries

It mainly covers how to build an singleton library but has this one snippet at the end.

Other types of APIs This pattern works well for RPC-style APIs, but tends to break down for more object-based or RESTful APIs as having a single interface class gets messy quickly. In those cases I find it makes more sense to break the interface down to resource-level, modeling things more like an ORM. I’ll cover that in a later post, next time I find the need to build one.

At the time of this research there was no follow-up post.

Ask HN: Best practices (and examples) for designing client libraries for APIs?

The first comment actually advocates for an object-oriented model but just using the top level client object for authentication and setup stuff.

Most of the sentiments kind of make me think using an object-oriented model might make more sense.

How to design a good API and why it matters

Nothing really to note from there. It’s more about the API itself and not the SDK.

Building A Creative & Fun API Client In Ruby: A Builder Pattern Variation

This is basically a guy who saw an singleton approach and said I want an object oriented approach.

How to design your API SDK

A developer from twilio talking about their approach to creating helper libraries and client libraries.

A point he makes is that “If you’ve designed your API in a RESTful way, your API endpoints should map to objects in your system”

This point isn’t explicitly asking for the object-oriented approach as the singelton approach just moves the verbs to the singleton, but usually still has data only objects for the different resources.

I say this, but the examples seem to use an object-oriented model.

How to build an SDK from scratch: Tutorial & best practices

Written by one of the SDK generation platforms.

It talks in general terms about creating data objects and mapping methods to endpoints. One of the points is suggests as a good grouping method is to group functions in service classes, essentially advocating for an object-oriented model.

Designing Pythonic library APIs

The two takeaways that are the most important to me when looking at these are

  • Design your library to be used as import lib … lib.Thing() rather than from lib import LibThing … LibThing().
  • Avoid global state; use a class instead

From that it seems using a singleton for are actions/verbs and then storing data in dataclasses would support both of the requirements. The examples in the post show a class that has functionality.

Using tree-shaking style imports should also allow for lower scopes. For example when only worrying about messages for a particular session in honcho a user could import just the messages namespace i.e.

from honcho.apps.users.sessions import messages
 
messages.create("sample")

so there are pythonic ways to make the code less verbose. However the benefit of having the entire string is making it clearer what messages are being discusses. Are these Honcho mesages? LangChain messages? It can get messy that way especially in the LLM space where many libraries and components are converging on similar naming schemes.

Build a Python SDK

Looks like a guide made by Cisco. I paid special attention to the “API Wrapper Module” section. It was a really barebones example in this guide that just implemented a very small client and put most of the attention on how to manage the connection logic.

It used one singleton object that had all the methods available for the API. There was no concept of resources or data objects here as no data was being persistently stored.

How to build a user-friendly Python SDK

Noticing the trend of abstracting all connection logic for http requests to a separate module and havign reusable methods for different http functions.

Main focus of the post was just on good practices of documentation, testing, and logical organization.

SDKs.io

A more comprehensive repository of thoughts and principles around SDK design. Made by APIMATIC. which seems to be another player in the code generation space.

I paid special attention to the Build section under Best Practices, and specifically the endpoints to methods and the models & serialization.

They state putting all methods in a single class (singleton) has the advantage of reducing the need to initialize classes, but can make the class size very large if there are many endpoints.

Grouping methods into different namespaces could probably remove this problem too. A nested singleton can reduce the confusion, while still not needing to mess with classes and objects.

It generally seems popular to at the very least create types and data objects for handling and storing API responses. They help with readability, type hints, data validations, etc. Regardless of the singleton or object-oriented approach data objects are something that should probably still be included.

Generating SDKs for your API

Advocates for using generators for making SDKs and talks about how different languages have different idioms and conventions that will be hard to manage.

Also mentions having the generator create data models.

Guiding Principles for Building SDKs

Some key insights

  • Make sure documentation is very comprehensive
  • Try to minimize external dependencies
  • Have modular design patterns that make it easy to extend and pick and choose features.

Should I implement OOP in a REST API?

Most people seem to be saying a full OOP method is overkill, but there are people advocating for having a controller class with methods that take data objects as inputs. Essentially advocating for the singelton approach with data only objects.

Analysis

Many of the generic concerns of SDK design do not have to do with the UX of the SDK for the end developer, rather background processes that an SDK handle. This includes:

  • Authentication
  • Retry Logic
  • Pagination
  • Logging

When it comes to the actual developer experience and interfaces for interacting with the SDK the community seems a bit split. This is very much because of the boring fact that REST APIs are designed very differently and so it depends on the specifics of the API.

Some APIs have many resources with basic CRUD operations. Others have many different endpoints, but only have a few resources. The singleton architecture vs a strict object-oriented approach again seems to depend a lot. Some sources advocate for a strict object-oriented approach where classes have their own methods, while others advocate for a singleton approach stating objects are overkill.

However, the singleton approach doesn’t completely abandon the idea of objects and almost always advocates for data objects, or some kind of models that can be used for type hints and validation.

There is some tradeoff regardless with problems arising at different levels of scale. The singleton approach could be verbose and cumbersome at smaller scales, but the object-oriented approach may not be a readable, and it could be unclear what methods are doing in complex codebases. Even GPT-4 couldn’t decide between the two.

Asking GPT-4 about Singleton vs Object-Oriented
Approaches

Again and again, the best way to approach SDK development is to just do whatever is easier, and create tons of documentation that will help developers navigate your API Ladder. Someone will get confused regardless of what you do, so the key is to make sure the SDK makes sense (even if it’s not the most efficient or clean) and remove hurdles for users to navigate errors and mistakes.

SDK Generation Platforms

With a sense of the best standards for SDK design and additional features that should be supported in the SDK I want to look at a few different options to determine what is the best solution to go with.

Below is a list of the different platforms I wanted to review

I was using the OpenAPI Spec for Honcho that was housed at https://demo.honcho.dev/openapi.json.

Stainless

Since the hacker news thread for the release of stainless is what spurred this research I decided to try them out first.

From their web portal they were able to take a link to the OpenAPI spec and generate a NodeJS and Python SDK immediately. There was no tweaking or anything necessary.

I mainly paid attention to the Python SDK. The code was very readable and made sense. I also liked how it used httpx and pydantic by default and made an async version of the interface. They took the singleton approach to the design of the interface. There was also built in capabilities for retries, pagination, and auth.

There’s also capability for adding custom code such as utility functions.

Speakeasy

Speakeasy required me to do everything locally through their brew package. It did not immediately accept the OpenAPI Spec and required me to make some tweaks. These were low-hanging fruit, and their cli has a handly AI tool that will diagnose the issue and tell you what to fix.

I just had to add a list of servers and deduplicate some routes. I’m happy it found these errors, but there was some friction for me to get started. Stainless just worked out of the box and made some logical assumptions.

The generated SDK didn’t feel as strong as the stainless one. There didn’t seem to support async methods, it did not use pydantic and used the built-in Python @dataclass. The methods had really unwieldy names, and looked like it would need a lot of tweaking to get it more production ready.

Liblab

Also had me do the generation from the cli using their npm package. It was pretty straightforward to login and give it an API spec. Liblab seems to require a lot tweaking to get better results. It gave me several warnings asking me to add tags to my API Spec. I did not add them and went ahead to look at the generation.

I’m not opposed to adding the tags if necessary, but I was able to get good results without adding them on other platforms.

The results were also lack luster. The SDK took the singleton approach as well, which seems to be the industry standard for codegen tools. The method names were also unwieldy. It also didn’t make use of pydantic and instead implemented its own BaseModel class. It was built on the requests model and doesn’t seem to support async methods.

OpenAPI Generator

This is the only one on the list that is not expressly backed by a company whose main goal is SDK generation. It is however a very popular project with many sponsors.

Again, I tried to generate a client from the cli using their npm package. I used version 7.5.0 and once again gave it my API Spec. It gave a few warnings about OpenAPI Spec v3.1 not being fully supported yet, but generated a package either way.

I again was not too impressed with the results, however I did like it more than liblab. The method names were also unwieldy, and the project relies on urllib3. I did not see an indication of support for an async client.

The repo did use pydantic for typing and data classes, which is a plus. Once again, the sdk use the singleton approach.

I also did not see any indication of functionality for retry logic, authentication, or pagination.


Conclusion

Overall, Stainless had the results that I liked the most. With almost no work from me, it produced a high quality SDK that designed things in a sensible way with many built-in features such as retries, pagination, and auth.

All the platforms took the singleton approach with a host of data models so there isn’t much to compare in that regard.

The other platforms did not produce anything unusable, but they seemed to use less modern features and require a lot more massaging to get a desirable result.

The docs for stainless also looked more clear, and it seems easier to add customizations after the fact.

I will give Speakeasy some kudos for having documentation for different API frameworks. The FastAPI one made it easy to figure out what I needed to tweak and how to do it. The AI debugging feature was also a nice help.

What I’m looking for right now is the platform or tool that can reduce my work the most and let me focus on other things and stainless achieved that. The results are not perfect, but it doesn’t look like it’ll need more than some slight tweaking and testing to get to a state I want.

Future Work

I want to see if I can combine an SDK generator platform with something like GritQL to automatically change the generated results to be in the format I’d like (singleton to object-oriented?)